CHAPTER 1



INTRODUCTION 



1.1 Optimal, Predictive and Adaptive Control 

This book covers various topics related to the design of discrete-time control sys- 
tems via the Linear Quadratic (LQ) control methodology. 

LQ control is an optimal control approach whereby the control law of a given 
dynamic linear system — the so-called plant — is analytically obtained by mini- 
mizing a performance index quadratic in the regulation/tracking error and control 
variables. LQ control is either deterministic or stochastic according to the
deterministic or stochastic nature of the plant. Mastering LQ control theory is
important for several reasons:

• LQ control theory provides a set of analytical design procedures that facilitate
the synthesis of control systems with desirable properties. These procedures, often
implemented in commercial software packages, yield a solution which can also be
used as a first cut in an iterative trial-and-error process, in case some
specifications are not met by the initial LQ solution.

• LQ control allows us to design control systems under various assumptions on
the information available to the controller. If this also includes knowledge
of the reference to be tracked, feedback as well as feedforward control laws
(the so-called 2-DOF controllers) are jointly obtained analytically.

• More advanced control design methodologies, such as H∞ control theory, can
be regarded as extensions of LQ control theory.

• LQ control theory can be applied to nonlinear systems operating on a small 
signal basis. 

• There exists a duality relationship between LQ control and Minimum-Mean-Square
linear prediction, filtering and smoothing. Hence any LQ control result has a
direct counterpart in the latter areas.

LQ control theory is complemented in the book with a treatment of multistep
predictive control algorithms. With respect to LQ control, predictive control basically
adds constraints on the tracking error and control variables and uses the receding
horizon control philosophy. In this way, relatively simple 2-DOF control laws can
be synthesized. Their distinctive feature is that the profile of the reference over the
prediction horizon can be made part of the overall control system design and dependent
on both the current plant state and the desired set point. This extra freedom can be
effectively used to guarantee a bumpless behaviour and to avoid exceeding saturation
bounds. These aspects are so important that multistep predictive control has gained
wide acceptance in industrial control applications.

Both the LQ and the predictive control methodologies assume that a model of 
the physical system is available to the designer. When, as often happens in prac- 
tice, this is not the case, we can combine on-line system identification and control 
design methods to build up adaptive control systems. Our study of such systems
covers basically two classes of adaptive control systems: single-step-ahead
self-tuning controllers and multistep predictive adaptive controllers. The controllers in
the first class are based on simple control laws, and the price paid for this simplicity
is the stability problems that they exhibit with nonminimum-phase and open-loop
unstable plants. They also require that the plant I/O delay be known. The multistep
predictive adaptive controllers have a substantially greater computational
load and require a greater effort for convergence analysis, but overcome the
above-mentioned limitations.



1.2 About This Book 

LQ control is by far the most thoroughly studied analytic approach to linear feed- 
back control system design. In particular, several excellent textbooks exist on the 
topic. Considering also the books already available on adaptive control at various
levels of rigour and generality, the question arises whether this new addition
to the existing literature can be justified. We answer this question by listing
some of the distinguishing features of this book.

• The Dynamic Programming vs. the Polynomial Equation Approach

LQ control, either in a deterministic or in a stochastic setting, is customarily
approached via Dynamic Programming by using a state-space or "internal" model
of the physical system. This is a time-domain approach and yields the desired
solution in terms of a Riccati difference equation. For time-invariant plants, the
so-called steady-state LQ control law is obtained by letting the control horizon
become infinitely long and, hence, can be computed by solving an algebraic
Riccati equation. This steady-state solution can also be obtained in an
alternative way, the so-called Polynomial Equation approach. This derives from a
quite different way of looking at the LQ control problem. It uses transfer matrices
or "external" models of the physical system, and turns out to be more akin to a
frequency-domain methodology. It leads us to solve the steady-state LQ control
problem by spectral factorization and a couple of linear Diophantine equations.

In this book the Dynamic Programming and the Polynomial Equation approach 
are thoroughly studied and compared, our experience being that mastering both 
approaches can be highly beneficial for the student or the practitioner. The two
approaches in fact play a synergetic role, providing us with two alternative ways
of looking at the same problem and with different sets of solving equations. As a
consequence, our insight is enhanced and our ability to apply the theory strengthened.

• Predictive vs. LQ Control

Multistep or long-range predictive control is an important topic for process control 
applications, some of the reasons having been outlined above. In this book the em- 
phasis is, however, on design techniques that are applicable when the plant is only 
partially known. Further, we study predictive control within the framework of LQ 
control theory. In fact, a predictive control law referred to as SIORHC (Stabilizing 
I/O Receding Horizon Control) is singled out by addressing a dynamic LQ control 
problem in the presence of suitable constraints on its terminal regulation/tracking 
error and on control signals. SIORHC has the peculiarity of possessing a guaranteed
stabilizing property, provided that its prediction horizon length is no smaller than
the order of the plant. This finite-horizon stabilizing property makes SIORHC
particularly well-suited for adaptive control, wherein stabilization must be ensured
irrespective of the actual value of the estimated plant parameters. As
its prediction horizon becomes larger, SIORHC tightly approximates steady-state LQ
control, when the latter optimally exploits the knowledge of the future reference
profile. However, thanks to the finite length of its prediction horizon, SIORHC can
be computed more easily than steady-state LQ control, since it does not require
the solution of an algebraic Riccati equation or a spectral factorization problem.



• Single-Step-Ahead vs. Multistep Predictive Adaptive Control

One entire chapter is devoted to single-step-ahead self-tuning control. This is 
mainly done to introduce the subject of adaptive control and the tools for analysing 
more general schemes, our prevalent interest being in adaptive (multistep) predictive
control systems because of their wider application potential. However, in
going from single-step-ahead to more complicated control design procedures such
as pole-assignment, LQ and some predictive control laws, a difficulty arises in that
it may not always be feasible to evaluate the control law. A typical situation is
when the estimated model has unstable pole-zero cancellations, i.e. the estimated
model becomes unstabilizable. We refer to the above difficulty as the singularity
problem. This has been one of the stumbling blocks for the construction of stable
adaptive predictive control systems. The standard way to circumvent the singularity
problem is to assume that the true plant parameter vector belongs to an a priori
known convex set whose elements correspond to reachable plant models. Next, the
recursive identification algorithm is modified so as to guarantee that the estimates
belong to the set. E.g., this can be achieved by embodying a projection facility in
the identification algorithm. The alleged prior knowledge of such a convex set is
instrumental to the development of (locally) convergent algorithms, but it does not
appear to be justifiable in many instances. In contrast with the above approach, in
order to address convergence of adaptive multistep predictive control systems, an
alternative technique is here adopted and analyzed. It consists of a self-excitation
mechanism by which a dither signal is superimposed on the plant input whenever
the estimated plant model is close to becoming unreachable. Under quite general
conditions, the self-excitation mechanism turns off after a finite time and global
convergence of the adaptive system is ensured.

• Implicit Adaptive Predictive Control

One classic result in stochastic adaptive control is that an autoregressive moving
average plant under Minimum-Variance control can be described in terms of a
linear-regression model. This allows one to construct simple implicit self-tuning
controllers based on the Minimum-Variance control law. In the book it is shown
that a similar property also holds when multistep predictive control laws are used.
Hence, implicit adaptive predictive controllers can be constructed, wherein simple
linear-regression identifiers are used. The fact that no global convergence proofs
are generally available (or even feasible) does not deter us from considering
implicit adaptive predictive control in view of its excellent local self-optimizing
properties in the presence of neglected dynamics, and hence its possible use for
autotuning reduced-complexity controllers of complex plants.

1.3 Part and Chapter Outline 

In this section, we briefly describe the breakdown of the book into parts and
chapters. There are three parts, listed below with some comments.

Part I: Basic deterministic theory of LQ and predictive control. This
part consists of Chapters 2-5. The purpose of Chapter 2 is to establish the main
facts on the deterministic LQ regulation problem. Dynamic Programming is dis- 
cussed and used to get the Riccati-based solution of the LQ regulator. Next, time- 
invariant LQ regulation is considered, and existence conditions and properties of 
the steady-state LQ regulator are established. Finally, two simple versions of LQ
regulation, based on a control horizon comprising a single step, are analyzed and
their limitations are pointed out. Chapter 3 introduces the d-representation of a
sequence, matrix-fraction descriptions of system transfer matrices and system
polynomial representations. Using these tools, a study follows on the characterization
of stability of feedback linear systems and on the so-called YJBK parameterization 
of all stabilizing compensators. Finally, the asymptotic tracking problem is con- 
sidered and formulated as a stability problem of a feedback system. In Chapter 4, 
the polynomial approach to LQ regulation is addressed, the related solution found 
in terms of a spectral factorization problem and a couple of linear Diophantine 
equations, and its relationship with the Riccati-based solution established. Some 
remarks follow on robust stability of LQ regulated problems. Chapter 5 intro- 
duces receding horizon control. Zero terminal state regulation is first considered 
so as to develop dynamic receding horizon regulation with a guaranteed stabilizing 
property. Within the same framework, generalized predictive regulation is treated. 
Next, receding horizon iterations are introduced, our interest in them being moti- 
vated by their possible use in adaptive multistep predictive control. Finally, the 
tracking problem is discussed. In particular, predictive control is introduced as a 
2-DOF receding horizon control methodology whereby the feedforward action is 
made dependent on the future reference evolution which, in turn, can be selected 
on-line so as to avoid saturation phenomena. 

Part II: State estimation, system identification, LQ and predictive stochastic
control. This part consists of Chapters 6 and 7. The purpose of Chapter 6
is to lay down the main results on recursive state estimation and system
identification. The Kalman filter and various linear and pseudo-linear recursive system
parameter estimators are considered and related to algorithms derived systematically
via the Prediction Error Method. Finally, convergence properties of recursive
identification algorithms are studied. The emphasis here is to prove convergence 
to the true system parameter vector under some strong conditions which typically 
cannot be guaranteed in adaptive control. Chapter 7 extends LQ and predictive 
receding-horizon control to a stochastic setting. To this end, Stochastic Dynamic 
Programming is used to yield the optimal LQG (Linear Quadratic Gaussian) control
solution via the so-called Certainty Equivalence Principle. Next, Minimum-Variance
control and steady-state LQ stochastic control for CARMA (Controlled
AutoRegressive Moving Average) plants are tackled via the stochastic variant of
the polynomial equation approach introduced in Chapter 4. Finally, 2-DOF tracking
and servo problems are considered, and the stabilizing predictive control law
introduced in Chapter 5 is extended to a stochastic setting.

Part III: Adaptive control. Chapters 8 and 9 combine recursive system
identification algorithms with LQ and predictive control methods to build adaptive
control systems for unknown linear plants. Chapter 8 describes the two basic groups
of adaptive controllers, viz. model-reference and self-tuning controllers. Next, we
point out the difficulties encountered in formulating adaptive control as an op- 
timal stochastic control problem, and, in contrast, the possibility of adopting a 
simple suboptimal procedure by enforcing the Certainty Equivalence Principle. We 
discuss the deterministic properties of the RLS (Recursive Least Squares) identi- 
fication algorithm not subject to persistency of excitation and, hence, applicable 
in the analysis of adaptive systems. These properties are used so as to construct 
a self-tuning control system, based on a simple one-step-ahead control law, for 
which global convergence can be established. Global convergence is also shown to 
hold true when a constant-trace RLS identifier with data normalization is used, 
the finite memory-length of this identifier being important for time-varying plants. 
Self-tuning Minimum-Variance control is discussed by pointing out that implicit
modelling of CARMA plants under Minimum-Variance control can be exploited so
as to construct algorithms whose global convergence can be proved via the stochastic
Lyapunov equation method. Further, it is shown that Generalized Minimum-Variance
control is equivalent to Minimum-Variance control of a modified plant,
and, hence, globally convergent self-tuning algorithms based on the former control
law can be developed by exploiting such an equivalence. Chapter 8 ends by
discussing how to robustify self-tuning single-step-ahead controllers so as to deal 
with plants with bounded disturbances and neglected dynamics. Chapter 9 studies 
various adaptive multistep predictive control algorithms, the main interest being 
in extending the potential applications beyond the restrictions inherent to
single-step-ahead controllers. We start by considering an indirect adaptive version of
the stabilizing predictive control (SIORHC) algorithm introduced in Chapter 5. 
We show that, in order to avoid a singularity problem in the controller parameter 
evaluation, the notion of a self-excitation mechanism can be used. The resulting 
control philosophy is of dual control type in that the self-excitation mechanism 
switches on an input dither whenever the estimated plant parameter vector becomes
close to singularity. The dither intensity must be suitably chosen, by taking
into account the interaction between the dither and the feedforward signal, so as to
ensure global convergence of the adaptive system. We next discuss how the indirect
adaptive predictive control algorithm can be robustified in order to deal with
bounded plant disturbances and neglected dynamics. The second part of Chapter 9 deals
with implicit adaptive predictive control. It first shows how the implicit modelling
property of CARMA plants, previously derived under Minimum-Variance control,
can be extended to more complex control laws, such as steady-state LQ stochastic
control and variants thereof. Next, the possible use of implicit prediction models in
adaptive predictive control is discussed and some examples of such controllers are
given. One such controller, MUSMAR, which possesses attractive local
self-optimizing properties, is studied via the Ordinary Differential Equation (ODE)
approach to the analysis of recursive stochastic algorithms. Two extensions of MUSMAR
are finally studied. These extensions aim, respectively, at recovering exactly the
steady-state LQ stochastic regulation law as an equilibrium point of the algorithm and
at imposing a mean-square input constraint on the controlled system.

Appendices Results from linear system theory, polynomial matrix theory, linear 
Diophantine equations, probability theory and stochastic processes. 

CHAPTER 2



DETERMINISTIC LQ REGULATION - I

RICCATI-BASED SOLUTION



The purpose of this chapter is to establish the main facts on the deterministic Linear 
Quadratic (LQ) regulator. After formulating the problem in Sect. 1, Dynamic 
Programming is discussed in Sect. 2 and used in Sect. 3 to get the Riccati-based 
solution of the LQ regulator. Sect. 4 discusses the time-invariant LQ regulation, 
the existence and properties of the steady-state regulator resulting asymptotically 
by letting the regulation horizon become infinitely large. Sect. 5 considers iterative 
methods for computing the steady-state regulator. In Sect. 6 and 7 two simple 
versions of LQ regulation, viz. Cheap Control and Single Step Control, are presented 
and analysed. 



2.1 The Deterministic LQ Regulation Problem 



The plant to be regulated consists of a discrete-time linear dynamic system
represented as follows

$$x(k+1) = \Phi(k)x(k) + G(k)u(k) \tag{2.1-1}$$

Here: $k \in \mathbb{Z} := \{\cdots, -1, 0, 1, \cdots\}$; $x(k) \in \mathbb{R}^n$ denotes the plant state at time $k$;
$u(k) \in \mathbb{R}^m$ the plant input or control at time $k$; and $\Phi(k)$ and $G(k)$ are matrices
of compatible dimensions.

Assuming that the plant state at a given time $t_0$ is $x(t_0)$, the interest is to find
a control sequence over the regulation horizon $[t_0, T]$, $t_0 \le T - 1$,

$$u_{[t_0,T)} := \{u(k)\}_{k=t_0}^{T-1} \tag{2.1-2}$$

which minimizes the quadratic performance index or cost functional

$$J(t_0, x(t_0), u_{[t_0,T)}) := \sum_{k=t_0}^{T-1} \left[ \|x(k)\|^2_{\psi_x(k)} + 2u'(k)M(k)x(k) + \|u(k)\|^2_{\psi_u(k)} \right] + \|x(T)\|^2_{\psi_x(T)} \tag{2.1-3}$$

where $\|x\|^2_{\psi} := x'\psi x$, and the prime denotes matrix transposition. W.l.o.g., it will
be assumed that $\psi_x(k)$, $\psi_u(k)$ and $\psi_x(T)$ are symmetric matrices.

Problem 2.1-1 Consider the quadratic form $x'\psi x$, $x \in \mathbb{R}^n$, with $\psi$ any $n \times n$ matrix with
real entries. Let $\psi_s = \psi_s' := (\psi + \psi')/2$. Show that $x'\psi x = x'\psi_s x$. [Hint: Use the fact that
$\psi = \psi_s + \psi_{\bar{s}}$ if $\psi_{\bar{s}} := (\psi - \psi')/2$.]

$J(t_0, x(t_0), u_{[t_0,T)})$ quantifies the regulation performance of the plant (1), from the
initial event $(t_0, x(t_0))$ when its input is specified by $u_{[t_0,T)}$. It is assumed that any
nonzero input $u(k) \neq O_m$ is costly. This condition amounts to assuming that the
symmetric matrix $\psi_u(k)$ is positive definite

$$\psi_u(k) = \psi_u'(k) > 0 \tag{2.1-4}$$

It is also assumed that the instantaneous loss at time $k$, viz. the term within brackets
in (3), is nonnegative

$$\ell(k, x(k), u(k)) := \|x(k)\|^2_{\psi_x(k)} + 2u'(k)M(k)x(k) + \|u(k)\|^2_{\psi_u(k)} \ge 0 \tag{2.1-5}$$

Since by (4) $\psi_u(k)$ is nonsingular, (5) is equivalent to the following nonnegative
definiteness condition

$$\psi_x(k) - M'(k)\psi_u^{-1}(k)M(k) \ge 0 \tag{2.1-6}$$

Problem 2.1-2 Consider the quadratic form

$$\ell(x, u) := \|x\|^2_{\psi_x} + 2u'Mx + \|u\|^2_{\psi_u}$$

with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $\psi_u = \psi_u' > 0$, and $\psi_x$ and $M$ matrices of compatible dimensions. Show
that $\ell(x, u) \ge 0$ for every $(x, u) \in \mathbb{R}^n \times \mathbb{R}^m$ if and only if $\psi_x - M'\psi_u^{-1}M \ge 0$. [Hint: Find the
vector $u^0(x) \in \mathbb{R}^m$ which minimizes $\ell(x, u)$ for any given $x$, viz. $\ell(x, u^0(x)) \le \ell(x, u)$, $\forall u \in \mathbb{R}^m$.]

The terminal cost $\|x(T)\|^2_{\psi_x(T)}$ is finally assumed nonnegative

$$\psi_x(T) = \psi_x'(T) \ge 0 \tag{2.1-7}$$

Let us consider the following as a formal statement of the deterministic LQ 
regulation problem. 

Deterministic LQ regulator (LQR) problem Consider the linear plant
(1). Define the quadratic performance index (3) with $\psi_x(k)$, $\psi_u(k)$, $\psi_x(T)$
symmetric matrices satisfying (4), (6) and (7). Find an optimal input $u^0_{[t_0,T)}$
to the plant (1), initialized from the event $(t_0, x(t_0))$, minimizing the
performance index (3).
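
As a minimal sketch of how the performance index (3) is evaluated for a candidate input sequence, the following NumPy fragment iterates the plant (1) and accumulates the instantaneous loss and the terminal cost. The function name `lq_cost`, the time-invariant matrices and the double-integrator data are assumptions made only for illustration; they do not come from the text.

```python
import numpy as np

def lq_cost(Phi, G, psi_x, psi_u, M, psi_xT, x0, u_seq):
    """Evaluate (2.1-3) along the trajectory of (2.1-1) driven by u_seq.

    Time-invariant matrices are assumed only to keep the sketch short.
    """
    x = np.asarray(x0, dtype=float)
    J = 0.0
    for u in u_seq:
        u = np.asarray(u, dtype=float)
        # instantaneous loss: ||x||^2_psi_x + 2 u' M x + ||u||^2_psi_u
        J += x @ psi_x @ x + 2.0 * (u @ M @ x) + u @ psi_u @ u
        x = Phi @ x + G @ u          # plant update (2.1-1)
    return J + x @ psi_xT @ x        # terminal cost ||x(T)||^2_psi_x(T)

# Hypothetical double-integrator data, chosen only for illustration.
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.0], [1.0]])
psi_x = np.eye(2)
psi_u = np.array([[1.0]])
M = np.zeros((1, 2))
psi_xT = np.eye(2)
x0 = np.array([1.0, 0.0])

# Cost of applying zero input over a horizon of two steps.
J_zero_input = lq_cost(Phi, G, psi_x, psi_u, M, psi_xT, x0, [[0.0], [0.0]])
```

With zero input the state stays at `x0`, so the cost is just the state penalty summed over two steps plus the terminal term. The LQR problem asks for the sequence that makes this number as small as possible.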

The general LQR problem can be transformed into an equivalent problem with no 
cross-product terms in its instantaneous loss. In order to see this, set 

$$u(k) = \bar{u}(k) - K(k)x(k) \tag{2.1-8}$$

This means that the plant input $u(k)$ at time $k$ is the sum of $-K(k)x(k)$, a
state-feedback component, and a vector $\bar{u}(k)$.

Problem 2.1-3 Consider the instantaneous loss (5). Rewrite it as

$$\bar{\ell}(k, x(k), \bar{u}(k)) := \ell(k, x(k), \bar{u}(k) - K(k)x(k)).$$

Show that the cross-product terms in $\bar{\ell}$ vanish provided that

$$K(k) = \psi_u^{-1}(k)M(k) \tag{2.1-9}$$

Show also that under the choice (9)

$$\bar{\ell}(k, x(k), \bar{u}(k)) = \|x(k)\|^2_{\bar{\psi}_x(k)} + \|\bar{u}(k)\|^2_{\psi_u(k)} \tag{2.1-10}$$

where $\bar{\psi}_x(k)$ equals the L.H.S. of (6)

$$\bar{\psi}_x(k) := \psi_x(k) - M'(k)\psi_u^{-1}(k)M(k) \tag{2.1-11}$$

Taking into account the solution of Problem 3, we can see that the general LQR
problem is equivalent to the following. Given the plant

$$x(k+1) = \left[\Phi(k) - G(k)\psi_u^{-1}(k)M(k)\right]x(k) + G(k)\bar{u}(k), \tag{2.1-12}$$

find an optimal input $\bar{u}^0_{[t_0,T)}$ minimizing the performance index

$$\bar{J}(t_0, x(t_0), \bar{u}_{[t_0,T)}) = \sum_{k=t_0}^{T-1} \bar{\ell}(k, x(k), \bar{u}(k)) + \|x(T)\|^2_{\psi_x(T)} \tag{2.1-13}$$

where the instantaneous loss is given by (10).
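
The change of input (8)-(11) can be checked numerically. The sketch below, under the assumption of randomly generated weights (all names and data are hypothetical), builds $K$ from (9) and the modified state weight from (11), then compares the original loss (5) with the cross-term-free loss (10) at one sample point.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2

# Hypothetical weights: psi_u > 0, and psi_x chosen so that (2.1-6) holds.
A = rng.standard_normal((m, m))
psi_u = A @ A.T + np.eye(m)
M = 0.1 * rng.standard_normal((m, n))
psi_x = np.eye(n) + M.T @ np.linalg.solve(psi_u, M)

K = np.linalg.solve(psi_u, M)          # K = psi_u^{-1} M, eq. (2.1-9)
psi_x_bar = psi_x - M.T @ K            # eq. (2.1-11)

def loss(x, u):
    """Instantaneous loss (2.1-5), cross-product term included."""
    return x @ psi_x @ x + 2.0 * (u @ M @ x) + u @ psi_u @ u

def loss_bar(x, u_bar):
    """Transformed loss (2.1-10), no cross-product term."""
    return x @ psi_x_bar @ x + u_bar @ psi_u @ u_bar

x = rng.standard_normal(n)
u_bar = rng.standard_normal(m)
u = u_bar - K @ x                      # eq. (2.1-8)
gap = abs(loss(x, u) - loss_bar(x, u_bar))
```

Here `gap` should vanish up to rounding, which is the content of Problem 2.1-3. Using `np.linalg.solve` rather than an explicit inverse is a design choice of this sketch, not something prescribed by the text.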

Problem 2.1-4 (An LQ Tracking Problem) Consider the plant (1) along with the $n$-dimensional
linear system

$$x_w(k+1) = \Phi_w(k)x_w(k)$$

with $x_w(t_0) \in \mathbb{R}^n$ given. Let

$$\tilde{x}(k) := x(k) - x_w(k)$$

and

$$J(t_0, \tilde{x}(t_0), u_{[t_0,T)}) := \sum_{k=t_0}^{T-1} \ell(k, \tilde{x}(k), u(k)) + \|\tilde{x}(T)\|^2_{\psi_x(T)}$$

$$\ell(k, \tilde{x}(k), u(k)) := \|\tilde{x}(k)\|^2_{\psi_x(k)} + 2u'(k)M(k)\tilde{x}(k) + \|u(k)\|^2_{\psi_u(k)}$$

Show that the problem of finding an optimal input $u^0_{[t_0,T)}$ for the plant (1) which minimizes the
above performance index can be cast into an equivalent LQR problem. [Hint: Consider the
plant with "extended" state $\chi(k) := [\, x'(k) \;\; x_w'(k) \,]'$.]
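
One possible way to carry out the hint is sketched below. The names `C`, `Phi_e`, `G_e`, `psi_x_e` and `M_e` are introduced here for illustration only and are not notation from the text, and time-invariant random data are assumed. The tracking error is a linear function of the extended state, so the tracking loss becomes an ordinary LQR loss with transformed weights.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 2, 1

# Hypothetical plant, reference generator and weights.
Phi = rng.standard_normal((n, n))
G = rng.standard_normal((n, m))
Phi_w = rng.standard_normal((n, n))
psi_x = np.eye(n)
psi_u = np.eye(m)
M = 0.1 * rng.standard_normal((m, n))

# Extended state chi = [x; x_w]; the tracking error is C chi.
C = np.hstack([np.eye(n), -np.eye(n)])
Phi_e = np.block([[Phi, np.zeros((n, n))],
                  [np.zeros((n, n)), Phi_w]])
G_e = np.vstack([G, np.zeros((n, m))])
psi_x_e = C.T @ psi_x @ C       # state weight on the extended state
M_e = M @ C                     # cross-weight on the extended state

x = rng.standard_normal(n)
x_w = rng.standard_normal(n)
u = rng.standard_normal(m)
chi = np.concatenate([x, x_w])
x_tilde = x - x_w

loss_tracking = x_tilde @ psi_x @ x_tilde + 2.0 * (u @ M @ x_tilde) + u @ psi_u @ u
loss_extended = chi @ psi_x_e @ chi + 2.0 * (u @ M_e @ chi) + u @ psi_u @ u

# One-step consistency of the extended dynamics with the tracking error.
chi_next = Phi_e @ chi + G_e @ u
x_tilde_next = (Phi @ x + G @ u) - Phi_w @ x_w
```

The two losses coincide, and `C @ chi_next` reproduces the next tracking error, so the extended plant with weights `psi_x_e`, `psi_u`, `M_e` has the form (1), (3). Condition (6) carries over because the extended Schur complement equals `C.T @ (psi_x - M.T @ inv(psi_u) @ M) @ C`.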



2.2 Dynamic Programming 

A solution method which exploits in an essential way the dynamic nature of the 
LQR problem is Bellman's technique of Dynamic Programming. Dynamic Pro- 
gramming is discussed here only to the extent necessary to solve the LQR problem. 
In doing this, we consider a larger class of optimal regulation problems so as to 
better focus our attention on the essential features of Dynamic Programming. 
Let the plant be described by a possibly nonlinear state-space representation 

$$x(k+1) = f(k, x(k), u(k)) \tag{2.2-1}$$

As in (1-1), $x(k) \in \mathbb{R}^n$ and $u(k) \in \mathbb{R}^m$. The function $f$, referred to as the local
state-transition function, specifies the rule according to which the event $(k, x(k))$
is transformed, by a given input $u(k)$ at time $k$, into the next plant state $x(k+1)$
at time $k+1$. By iterating (1), it is possible to define the global state-transition
function

$$x(j) = \varphi(j, k, x(k), u_{[k,j)}), \quad j \ge k \tag{2.2-2}$$

The function $\varphi$, for a given input sequence $u_{[k,j)}$, $j \ge k$, specifies the rule according
to which the initial event $(k, x(k))$ is transformed into the final event $(j, x(j))$. E.g.

$$\begin{aligned} x(k+2) &= f(k+1, x(k+1), u(k+1)) \\ &= f(k+1, f(k, x(k), u(k)), u(k+1)) \qquad [(1)] \\ &=: \varphi(k+2, k, x(k), u_{[k,k+2)}) \end{aligned}$$

For $j = k$, $u_{[k,j)}$ is empty, and, consequently, the system is left in the event $(k, x(k))$.
This amounts to assuming that $\varphi$ satisfies the following consistency condition

$$\varphi(k, k, x(k), u_{[k,k)}) = x(k) \tag{2.2-3}$$

Problem 2.2-1 Show that for the linear dynamic system (1-1), the global state-transition
function equals

$$\varphi(j, k, x(k), u_{[k,j)}) = \Phi(j, k)x(k) + \sum_{i=k}^{j-1} \Phi(j, i+1)G(i)u(i)$$

where

$$\Phi(j, k) := \begin{cases} I_n, & j = k \\ \Phi(j-1) \cdots \Phi(k), & j > k \end{cases} \tag{2.2-4}$$

is the state-transition matrix of the linear system.
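
The closed-form expression of Problem 2.2-1 can be compared with a direct iteration of (1-1). The following sketch assumes randomly generated time-varying matrices stored in Python lists indexed by time; the function names are hypothetical.

```python
import numpy as np

def transition_matrix(Phi_seq, j, k):
    """Phi(j, k) = Phi(j-1) ... Phi(k) for j > k, identity for j = k (2.2-4)."""
    out = np.eye(Phi_seq[0].shape[0])
    for i in range(k, j):
        out = Phi_seq[i] @ out      # left-multiply by the later matrix
    return out

def global_transition(Phi_seq, G_seq, j, k, x_k, u_seq):
    """phi(j, k, x(k), u_[k,j)) from the formula of Problem 2.2-1."""
    x = transition_matrix(Phi_seq, j, k) @ x_k
    for i in range(k, j):
        x = x + transition_matrix(Phi_seq, j, i + 1) @ G_seq[i] @ u_seq[i]
    return x

rng = np.random.default_rng(0)
n, m, horizon = 3, 2, 5
Phi_seq = [rng.standard_normal((n, n)) for _ in range(horizon)]
G_seq = [rng.standard_normal((n, m)) for _ in range(horizon)]
u_seq = [rng.standard_normal(m) for _ in range(horizon)]
x_k = rng.standard_normal(n)

# Reference: iterate the local state-transition function from k to j.
k, j = 1, 5
x_iter = x_k.copy()
for i in range(k, j):
    x_iter = Phi_seq[i] @ x_iter + G_seq[i] @ u_seq[i]

x_formula = global_transition(Phi_seq, G_seq, j, k, x_k, u_seq)
```

Both routes give the same state, and `transition_matrix(Phi_seq, k, k)` returns the identity, in agreement with the consistency condition (3).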

Along with the plant (1) initialized from the event $(t_0, x(t_0))$, we consider the
following possibly nonquadratic performance index

$$J(t_0, x(t_0), u_{[t_0,T)}) = \sum_{k=t_0}^{T-1} \ell(k, x(k), u(k)) + \psi(x(T)) \tag{2.2-5}$$

Here again $\ell(k, x(k), u(k))$ stands for a nonnegative instantaneous loss incurred at
time $k$, $\psi(x(T))$ for a nonnegative loss due to the terminal state $x(T)$, and $[t_0, T]$
for the regulation horizon. The problem is to find an optimal control $u^0_{[t_0,T)}$ for the
plant (1), initialized from $(t_0, x(t_0))$, minimizing (5).

Hereafter, conditions on (1) and (5) will be implicitly assumed in order that each
step of the adopted optimization procedure makes sense. For $t \in [t_0, T]$, consider
the so-called Bellman function

$$\begin{aligned} V(t, x(t)) &:= \min_{u_{[t,T)}} J(t, x(t), u_{[t,T)}) \\ &= \min_{u_{[t,t_1)}} \min_{u_{[t_1,T)}} \Big\{ \sum_{k=t}^{t_1-1} \ell(k, x(k), u(k)) + J(t_1, \varphi(t_1, t, x(t), u_{[t,t_1)}), u_{[t_1,T)}) \Big\} \end{aligned} \tag{2.2-6}$$

The second equality follows since $u_{[t,T)} = u_{[t,t_1)} \otimes u_{[t_1,T)}$ for $t_1 \in [t, T)$, $\otimes$ denoting
concatenation. Eq. (6) can be rewritten as follows

$$\begin{aligned} V(t, x(t)) &= \min_{u_{[t,t_1)}} \Big\{ \sum_{k=t}^{t_1-1} \ell(k, x(k), u(k)) + \min_{u_{[t_1,T)}} J(t_1, \varphi(t_1, t, x(t), u_{[t,t_1)}), u_{[t_1,T)}) \Big\} \\ &= \min_{u_{[t,t_1)}} \Big\{ \sum_{k=t}^{t_1-1} \ell(k, x(k), u(k)) + V(t_1, \varphi(t_1, t, x(t), u_{[t,t_1)})) \Big\} \end{aligned} \tag{2.2-7}$$
Suppose now that $u^0_{[t,T)}$ is an optimal input over the horizon $[t, T)$ for the initial
event $(t, x(t))$, viz.

$$V(t, x(t)) = J(t, x(t), u^0_{[t,T)}) \le J(t, x(t), u_{[t,T)})$$

for all control sequences $u_{[t,T)}$. Then, from (7) it follows that $u^0_{[t_1,T)}$, the restriction
of $u^0_{[t,T)}$ to $[t_1, T)$, is again an optimal input over the horizon $[t_1, T)$ for the initial
event $(t_1, x(t_1))$, $x(t_1) := \varphi(t_1, t, x(t), u^0_{[t,t_1)})$, viz.

$$V(t_1, x(t_1)) = J(t_1, x(t_1), u^0_{[t_1,T)}) \le J(t_1, x(t_1), u_{[t_1,T)})$$

The above statement is a way of expressing Bellman's Principle of Optimality. In
words,

the Principle of Optimality states that an optimal input sequence $u^0_{[t_0,T)}$
is such that, given an event $(t_1, x(t_1))$ along the corresponding optimal trajectory,
$x(t_1) = \varphi(t_1, t_0, x(t_0), u^0_{[t_0,t_1)})$, the subsequent input sequence $u^0_{[t_1,T)}$
is again optimal for the cost-to-go over the horizon $[t_1, T]$.

For $t_1 = t + 1$, (7) yields the Bellman equation

$$V(t, x(t)) = \min_{u(t)} \left\{ \ell(t, x(t), u(t)) + V(t+1, f(t, x(t), u(t))) \right\} \tag{2.2-8}$$

with the terminal event condition

$$V(T, x(T)) = \psi(x(T)) \tag{2.2-9}$$

The functional equation (8) can be used as follows. Eq. (8) for $t = T - 1$ gives

$$\begin{aligned} V(T-1, x(T-1)) &= \min_{u(T-1)} \left\{ \ell(T-1, x(T-1), u(T-1)) + \psi(x(T)) \right\} \\ x(T) &= f(T-1, x(T-1), u(T-1)) \end{aligned} \tag{2.2-10}$$

If this can be solved w.r.t. $u(T-1)$ for any state $x(T-1)$, one finds an optimal
input at time $T-1$ in a state-feedback form

$$u^0(T-1) = u^0(T-1, x(T-1)) \tag{2.2-11}$$

and hence determines $V(T-1, x(T-1))$. By iterating backward the above procedure,
provided that at each step a solution can be found, one can determine an
optimal control law in a state-feedback form

$$u^0(k) = u^0(k, x(k)), \quad k \in [t_0, T) \tag{2.2-12}$$

and $V(k, x(k))$.

Before proceeding any further, let us consolidate the discussion so far. We have
used the Principle of Optimality of Dynamic Programming to obtain the Bellman
equation (8). This suggests the procedure outlined above for obtaining an optimal
control. It is remarkable that, if a solution can be obtained, it is in a state-feedback
form. The next theorem shows that, provided that the procedure yields a solution, it
solves the optimal regulation problem at hand.
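
The backward use of the Bellman equation (8) with terminal condition (9) can be sketched on a toy problem with finitely many states and inputs, where every minimization is a table lookup. The saturated-integrator plant, the loss functions and the horizon below are hypothetical choices made only for this illustration; they are not taken from the text.

```python
from itertools import product

STATES = range(-3, 4)     # admissible plant states
INPUTS = (-1, 0, 1)       # admissible inputs
T = 4                     # regulation horizon [0, T]

def f(t, x, u):
    """Local state-transition function: an integrator saturated at +/-3."""
    return max(-3, min(3, x + u))

def loss(t, x, u):
    """Nonnegative instantaneous loss."""
    return x * x + u * u

def terminal_loss(x):
    """Nonnegative terminal loss psi(x(T))."""
    return x * x

def backward_dp():
    """Iterate (2.2-8) backward from the terminal condition (2.2-9)."""
    V = {(T, x): terminal_loss(x) for x in STATES}
    policy = {}
    for t in range(T - 1, -1, -1):
        for x in STATES:
            best_u = min(INPUTS,
                         key=lambda u: loss(t, x, u) + V[(t + 1, f(t, x, u))])
            policy[(t, x)] = best_u          # state-feedback law u0(t, x)
            V[(t, x)] = loss(t, x, best_u) + V[(t + 1, f(t, x, best_u))]
    return V, policy

def cost(x0, u_seq):
    """Performance index (2.2-5) for an open-loop input sequence."""
    x, J = x0, 0
    for t, u in enumerate(u_seq):
        J += loss(t, x, u)
        x = f(t, x, u)
    return J + terminal_loss(x)

V, policy = backward_dp()
x0 = 3
brute_force = min(cost(x0, u_seq) for u_seq in product(INPUTS, repeat=T))
```

The value `V[(0, x0)]` obtained by backward iteration agrees with the minimum found by enumerating all input sequences, and the table `policy` is the optimal control in state-feedback form. Enumeration grows exponentially with the horizon, whereas the backward pass is linear in it.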

Theorem 2.2-1. Suppose that $\{V(t, x)\}_{t=t_0}^{T}$ satisfies the Bellman equation (8)
with terminal condition (9). Suppose that a minimum as in (8) exists and is
attained at

$$\hat{u}(t) = \hat{u}(t, x)$$

viz.

$$\ell(t, x, \hat{u}(t)) + V(t+1, f(t, x, \hat{u}(t))) \le \ell(t, x, u) + V(t+1, f(t, x, u)), \quad \forall u \in \mathbb{R}^m.$$

Define $x^0_{[t_0,T]}$ and $u^0_{[t_0,T)}$ recursively as follows

$$x^0(t_0) = x(t_0) \tag{2.2-13}$$

$$\left. \begin{aligned} u^0(t) &= \hat{u}(t, x^0(t)) \\ x^0(t+1) &= f(t, x^0(t), u^0(t)) \end{aligned} \right\} \quad t = t_0, t_0+1, \cdots, T-1 \tag{2.2-14}$$

Then $u^0_{[t_0,T)}$ is an optimal control sequence, and the minimum cost equals $V(t_0, x(t_0))$.

Proof It is to be shown that, if $u^0_{[t_0,T)}$ is defined as in (14),

$$V(t_0, x(t_0)) = J(t_0, x(t_0), u^0_{[t_0,T)}) \le J(t_0, x(t_0), u_{[t_0,T)}) \tag{2.2-15}$$

for all control sequences $u_{[t_0,T)}$.

Since for $x(t) = x^0(t)$, the R.H.S. of (8) attains its minimum at $u^0(t)$, one has

$$V(t, x^0(t)) = \ell(t, x^0(t), u^0(t)) + V(t+1, x^0(t+1)) \tag{2.2-16}$$

Hence

$$\begin{aligned} V(t_0, x(t_0)) - V(T, x^0(T)) &= \sum_{t=t_0}^{T-1} \left[ V(t, x^0(t)) - V(t+1, x^0(t+1)) \right] \\ &= \sum_{t=t_0}^{T-1} \ell(t, x^0(t), u^0(t)) \end{aligned} \tag{2.2-17}$$

Since $V(T, x^0(T)) = \psi(x^0(T))$, the equality in (15) follows.

Now for every control sequence $u_{[t_0,T)}$ applied to the plant initialized from the event $(t_0, x(t_0))$,
one has

$$V(t, x(t)) \le \ell(t, x(t), u(t)) + V(t+1, x(t+1)) \tag{2.2-18}$$

if

$$x(t+1) = f(t, x(t), u(t)) = \varphi(t+1, t_0, x(t_0), u_{[t_0,t+1)})$$

Using (18) instead of (16), one finds the next inequality in place of (17)

$$\begin{aligned} V(t_0, x(t_0)) &\le \sum_{t=t_0}^{T-1} \ell(t, x(t), u(t)) + \psi(x(T)) \\ &= J(t_0, x(t_0), u_{[t_0,T)}). \end{aligned} \tag{2.2-19}$$






[Figure 2.2-1: Optimal solution of the regulation problem in a state-feedback form
as given by Dynamic Programming. Block diagram: the plant, initialized from the
event $(t_0, x(t_0))$, generates the state $x(t) = \varphi(t, t_0, x(t_0), u_{[t_0,t)})$, which is fed
back through the control law $u(t) = \hat{u}(t, x(t))$.]



Main points of the section. The Bellman equation of Dynamic Programming, if
solvable, yields, via backward iterations, the optimal regulator in a state-feedback
form (Fig. 1).



2.3 Riccati-Based Solution



The Bellman equation (2-8) is now applied to solve the deterministic LQR problem
of Sect. 2.1. In this case, the plant is as in (1-1), the performance index
as in (1-3) with the instantaneous loss as in (1-5). Taking into account (1-7),
one sees that $V(T, x(T))$, the Bellman function at the terminal event, equals the
quadratic function $x'(T)\psi_x(T)x(T)$ with the matrix $\psi_x(T)$ symmetric and
nonnegative definite. By adopting the procedure outlined in Sect. 2.2 to compute backward
$V(t, x(t))$, $t = T-1, T-2, \cdots, t_0$, the solution of the LQR problem is obtained.

Theorem 2.3-1. The solution to the deterministic LQR problem of Sect. 2.1 is
given by the following linear state-feedback control law

$$u(t) = F(t)x(t), \quad t \in [t_0, T) \tag{2.3-1}$$

where $F(t)$ is the LQR feedback-gain matrix

$$F(t) = -\left[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)\right]^{-1}\left[M(t) + G'(t)\mathcal{P}(t+1)\Phi(t)\right] \tag{2.3-2}$$

and $\mathcal{P}(t)$ is the symmetric nonnegative definite matrix given by the solution of the
following Riccati backward difference equation

$$\begin{aligned} \mathcal{P}(t) = {} & \Phi'(t)\mathcal{P}(t+1)\Phi(t) - \left[M'(t) + \Phi'(t)\mathcal{P}(t+1)G(t)\right] \times \\ & \left[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)\right]^{-1}\left[M(t) + G'(t)\mathcal{P}(t+1)\Phi(t)\right] + \psi_x(t) \end{aligned} \tag{2.3-3}$$

$$= \Phi'(t)\mathcal{P}(t+1)\Phi(t) - F'(t)\left[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)\right]F(t) + \psi_x(t) \tag{2.3-4}$$

$$\begin{aligned} = {} & \left[\Phi(t) + G(t)F(t)\right]'\mathcal{P}(t+1)\left[\Phi(t) + G(t)F(t)\right] + \\ & F'(t)\psi_u(t)F(t) + M'(t)F(t) + F'(t)M(t) + \psi_x(t) \end{aligned} \tag{2.3-5}$$

with terminal condition

$$\mathcal{P}(T) = \psi_x(T) \tag{2.3-6}$$

Further,

$$V(t, x(t)) = \min_{u_{[t,T)}} J(t, x(t), u_{[t,T)}) = x'(t)\mathcal{P}(t)x(t) \tag{2.3-7}$$



Proof (by induction) It is known that $V(T, x(T))$ is given by (7) if $\mathcal{P}(T)$ is as in (6). Next,
assume that $V(t+1, x(t+1)) = \|x(t+1)\|^2_{\mathcal{P}(t+1)}$ with $\mathcal{P}(t+1) = \mathcal{P}'(t+1) \ge 0$ and
$x(t+1) = \Phi(t)x(t) + G(t)u(t)$. Show that $V(t, x(t)) = \|x(t)\|^2_{\mathcal{P}(t)}$ with $\mathcal{P}(t)$ satisfying (3). One has

$$V(t, x(t)) = \min_{u(t)} J(t, x(t), u(t))$$

$$J(t, x(t), u(t)) := \|x(t)\|^2_{\psi_x(t)} + 2u'(t)M(t)x(t) + \|u(t)\|^2_{\psi_u(t)} + \|\Phi(t)x(t) + G(t)u(t)\|^2_{\mathcal{P}(t+1)} \qquad (2.3-8)$$

Let $u(t) = [u_1(t) \cdots u_m(t)]'$. Set to zero the gradient vector of J w.r.t. u(t)

$$O_m = \frac{\partial J(t, x(t), u(t))}{2\,\partial u(t)} := \frac{1}{2}\left[\frac{\partial J}{\partial u_1(t)} \cdots \frac{\partial J}{\partial u_m(t)}\right]' = \left[M(t) + G'(t)\mathcal{P}(t+1)\Phi(t)\right]x(t) + \left[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)\right]u(t) \qquad (2.3-9)$$

This yields (1) and (2). That these two equations give uniquely the optimizing input u(t)
follows from invertibility of $[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)]$ and positive definiteness of the Hessian
matrix

$$\frac{\partial^2 J(t, x(t), u(t))}{\partial u^2(t)} = 2\left[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)\right] > 0$$

Substituting (1) and (2) into $J(t, x(t), u(t))$, (7) is obtained with $\mathcal{P}(t)$ satisfying (3). Eq. (3) shows
that $\mathcal{P}(t)$ is symmetric.

To complete the induction it now remains to show that $\mathcal{P}(t)$ is nonnegative definite. Rewrite
(4) as follows

$$\mathcal{P}(t) = \Phi'(t)\mathcal{P}(t+1)\left[\Phi(t) + G(t)F(t)\right] + M'(t)F(t) + \psi_x(t) \qquad (2.3-10)$$

Further, premultiply both sides of (2) by $F'(t)[\psi_u(t) + G'(t)\mathcal{P}(t+1)G(t)]$ to get

$$F'(t)\psi_u(t)F(t) = -F'(t)M(t) - F'(t)G'(t)\mathcal{P}(t+1)\left[\Phi(t) + G(t)F(t)\right] \qquad (2.3-11)$$

Subtracting (11) from (10), we find (5). Next, by virtue of (1-6),

$$F'(t)\psi_u(t)F(t) + M'(t)F(t) + F'(t)M(t) + \psi_x(t) \ge F'(t)\psi_u(t)F(t) + M'(t)F(t) + F'(t)M(t) + M'(t)\psi_u^{-1}(t)M(t) = \left[F'(t) + M'(t)\psi_u^{-1}(t)\right]\psi_u(t)\left[F(t) + \psi_u^{-1}(t)M(t)\right] \qquad (2.3-12)$$

From (5), (12) and nonnegative definiteness of $\mathcal{P}(t+1)$, $\mathcal{P}(t)$ is seen to be lowerbounded by the
sum of two nonnegative definite matrices. Hence, $\mathcal{P}(t)$ is also nonnegative definite.
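
The backward recursion of Theorem 2.3-1 can be sketched numerically as follows. This is a minimal illustration, not part of the original text: the function names `lqr_backward` and `simulate_cost`, the list-indexed interface, and the random test plant are all assumptions made here. The sketch computes the gains via (2.3-2) and (2.3-3) and then checks (2.3-7) by simulating the closed loop.

```python
import numpy as np

def lqr_backward(Phi, G, psi_x, psi_u, M, psi_xT, T):
    """Backward Riccati recursion (2.3-2)-(2.3-3) for t = T-1, ..., 0.

    Phi, G, psi_x, psi_u, M are lists indexed by t (an assumed interface).
    Returns the matrices P[0..T] and the gains F[0..T-1].
    """
    P = [None] * (T + 1)
    F = [None] * T
    P[T] = psi_xT                                    # terminal condition (2.3-6)
    for t in range(T - 1, -1, -1):
        Pn = P[t + 1]
        S = psi_u[t] + G[t].T @ Pn @ G[t]
        L = M[t] + G[t].T @ Pn @ Phi[t]
        F[t] = -np.linalg.solve(S, L)                # feedback gain (2.3-2)
        P[t] = Phi[t].T @ Pn @ Phi[t] + L.T @ F[t] + psi_x[t]   # Riccati step (2.3-3)
    return P, F

def simulate_cost(x0, control, Phi, G, psi_x, psi_u, M, psi_xT, T):
    """Quadratic cost accumulated along the trajectory driven by u(t) = control(t, x(t))."""
    x, J = x0.copy(), 0.0
    for t in range(T):
        u = control(t, x)
        J += x @ psi_x[t] @ x + 2.0 * u @ M[t] @ x + u @ psi_u[t] @ u
        x = Phi[t] @ x + G[t] @ u
    return J + x @ psi_xT @ x

# Hypothetical time-varying plant with unit weights and no cross term.
rng = np.random.default_rng(0)
n, m, T = 3, 2, 6
Phi = [rng.standard_normal((n, n)) for _ in range(T)]
G = [rng.standard_normal((n, m)) for _ in range(T)]
psi_x = [np.eye(n)] * T
psi_u = [np.eye(m)] * T
M = [np.zeros((m, n))] * T
psi_xT = np.eye(n)

P, F = lqr_backward(Phi, G, psi_x, psi_u, M, psi_xT, T)
x0 = rng.standard_normal(n)
J_opt = simulate_cost(x0, lambda t, x: F[t] @ x, Phi, G, psi_x, psi_u, M, psi_xT, T)
V0 = x0 @ P[0] @ x0      # Bellman function (2.3-7) evaluated at t = 0
```

Under these assumptions `J_opt` and `V0` should agree up to round-off, and any other input sequence should give a larger cost.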



Main points of the section For any horizon of finite length the LQR problem 
is solved (Fig. 1) by a regulator consisting of a linear time-varying state-feedback 
gain matrix F(t), computable by solving a Riccati difference equation.
Figure 2.3-1: LQR solution.

2.4 Time-Invariant LQR 

It is of interest to make a detailed study of the LQR properties in the time-invariant
case. In this case, the plant (1-1) and the weights in (1-3) are time-invariant, viz.
$\Phi(k) = \Phi$; $G(k) = G$; $\psi_x(k) = \psi_x$; $M(k) = M$; and $\psi_u(k) = \psi_u$, for
$k = t_0, \dots, T-1$.

By time-invariance, we have for the cost (1-3)

$$J\left(t_0, x, u_{[t_0,T)}\right) = J\left(0, x, \hat{u}_{[0,N)}\right) \qquad (2.4-1)$$

where

$$\hat{u}(\cdot) := u(\cdot + t_0) \quad \text{and} \quad N := T - t_0$$

for any $x \in \mathbb{R}^n$ and input sequence $u(\cdot)$. In (1), $u(\cdot + t_0)$ indicates the sequence $u(\cdot)$
anticipated in time by $t_0$ steps. The notation can be further simplified by rewriting
(1) as $J(x, u_{[0,N)})$, where it is understood that x denotes the initial state of the
plant to be regulated, and N the length of the regulation horizon. The following is
a restatement of the deterministic LQR problem in the time-invariant case.



LQR problem in the time-invariant case  Consider the time-invariant
linear plant

$$x(k+1) = \Phi x(k) + Gu(k), \qquad x(0) = x \qquad (2.4-2)$$

along with the quadratic performance index

$$J\left(x, u_{[0,N)}\right) := \sum_{k=0}^{N-1} \ell(x(k), u(k)) + \|x(N)\|^2_{\psi_x(N)} \qquad (2.4-3)$$

$$\ell(x, u) := \|x\|^2_{\psi_x} + 2u'Mx + \|u\|^2_{\psi_u} \qquad (2.4-4)$$

where $\psi_x$, $\psi_u$, $\psi_x(N)$ are symmetric matrices satisfying

$$\psi_u = \psi_u' > 0 \qquad (2.4-5)$$

$$\bar{\psi}_x := \psi_x - M'\psi_u^{-1}M \ge 0 \qquad (2.4-6)$$

$$\psi_x(N) = \psi_x'(N) \ge 0 \qquad (2.4-7)$$

Find an optimal input $u^\circ(\cdot)$ to the plant (2) with initial state x, minimizing
the performance index (3) over an N-steps regulation horizon.

For any finite N, Theorem 3-1 provides, of course, the solution to the problem (2)-(7).
Here, the solution depends on the matrix sequence $\{\mathcal{P}(t)\}_{t=0}^{N}$ which can be
computed by iterating backward the matrix Riccati equation (3-3)-(3-5). Equivalently,
by setting

$$P(j) := \mathcal{P}(N - j), \qquad j = 0, 1, \dots, N \qquad (2.4-8)$$

we can express the solution via Riccati forward iterations as in the next theorem.

Theorem 2.4-1. In the time-invariant case, the solution to the deterministic LQR
problem (2)-(7) is given by the following state-feedback control

$$u(N - j) = F(j)x(N - j), \qquad j = 1, \dots, N \qquad (2.4-9)$$

where F(j) is the LQR feedback matrix

$$F(j) = -\left[\psi_u + G'P(j-1)G\right]^{-1}\left[M + G'P(j-1)\Phi\right] \qquad (2.4-10)$$

and P(j) is the symmetric nonnegative definite matrix solution of the following
Riccati forward difference equation

$$P(j) = \Phi'P(j-1)\Phi - \left[M' + \Phi'P(j-1)G\right]\left[\psi_u + G'P(j-1)G\right]^{-1}\left[M + G'P(j-1)\Phi\right] + \psi_x \qquad (2.4-11)$$

$$\phantom{P(j)} = \Phi'P(j-1)\Phi - F'(j)\left[\psi_u + G'P(j-1)G\right]F(j) + \psi_x \qquad (2.4-12)$$

$$\phantom{P(j)} = \left[\Phi + GF(j)\right]'P(j-1)\left[\Phi + GF(j)\right] + F'(j)\psi_u F(j) + M'F(j) + F'(j)M + \psi_x \qquad (2.4-13)$$

with initial condition

$$P(0) = \psi_x(N) \qquad (2.4-14)$$

Further, the Bellman function $V_j(x)$, relative to an initial state x and a j-steps
regulation horizon, with terminal state costed by P(0), equals

$$V_j(x) := \min_{u_{[N-j,N)}} J\left(x, u_{[N-j,N)}\right) = \min_{u_{[0,j)}} J\left(x, u_{[0,j)}\right) = x'P(j)x \qquad (2.4-15)$$



Our interest will now be focused on the limit properties of the LQR solution
(9)-(15) as $j \to \infty$, i.e. as the length of the regulation horizon becomes infinite. The
interest is motivated by the fact that, if a limit solution exists, the corresponding
state-feedback may yield good transient as well as steady-state regulation properties
to the controlled system.

We start by studying the convergence properties of P(j) as $j \to \infty$. As the next
Example 1 shows, the limit of P(j) for $j \to \infty$ need not exist. In particular, we see
that some stabilizability condition on the pair $(\Phi, G)$ must be satisfied if the limit
is to exist.

Example 2.4-1  Consider the plant (2) with

$$\Phi = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}, \qquad G = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \qquad (2.4-16)$$

For the pair $(\Phi, G)$, 2 is an unstable unreachable eigenvalue. Hence, $(\Phi, G)$ is not stabilizable.
Let $x_2(k)$ be the second component of the plant state x(k). It is seen that $x_2(k)$ is unaffected by
$u(\cdot)$. In fact, it satisfies the following homogeneous difference equation

$$x_2(k+1) = 2x_2(k) \qquad (2.4-17)$$

Consider the performance index (3) with $\psi_x(N) = O_{2\times 2}$ and instantaneous loss

$$\ell(x, u) = x_2^2 + u^2 \qquad (2.4-18)$$

Assume that the corresponding matrix sequence $\{P(j)\}_{j=0}^{\infty}$ admits a limit as $j \to \infty$

$$\lim_{j\to\infty} P(j) = P(\infty) \qquad (2.4-19)$$

Then, according to (15), there is an input sequence for which

$$\lim_{j\to\infty} J\left(x, u_{[0,j)}\right) = \lim_{j\to\infty} \sum_{k=0}^{j-1}\left[x_2^2(k) + u^2(k)\right] = x'P(\infty)x < \infty$$

However, the last inequality contradicts the fact that the performance index (3), with $\ell(x, u)$ as in
(18) and $x_2(k)$ satisfying (17), diverges as $j \to \infty$ for any initial state $x \in \mathbb{R}^2$ such that $x_2 \ne 0$,
irrespective of the input sequence. Therefore, by contradiction, we conclude that the limit (19)
does not exist.
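
The divergence claimed in Example 2.4-1 can be observed numerically. The following is a small sketch, assuming the matrices as reconstructed in (2.4-16) and the loss (2.4-18); the helper name `riccati_step` is introduced here for illustration only. For this plant one can check by hand that $P(j) = \mathrm{diag}(0, p_j)$ with $p_{j+1} = 4p_j + 1$, so the (2,2) entry grows like $4^j$.

```python
import numpy as np

def riccati_step(P, Phi, G, psi_x, psi_u, M):
    """One forward Riccati iteration (2.4-10)-(2.4-11); returns the new P and the gain F."""
    S = psi_u + G.T @ P @ G
    L = M + G.T @ P @ Phi
    F = -np.linalg.solve(S, L)
    return Phi.T @ P @ Phi + L.T @ F + psi_x, F

Phi = np.array([[1.0, 1.0], [0.0, 2.0]])
G = np.array([[1.0], [0.0]])
psi_x = np.diag([0.0, 1.0])      # instantaneous loss x_2^2 + u^2
psi_u = np.array([[1.0]])
M = np.zeros((1, 2))

P = np.zeros((2, 2))             # P(0) = psi_x(N) = O
p22 = []
for j in range(30):
    P, _ = riccati_step(P, Phi, G, psi_x, psi_u, M)
    p22.append(P[1, 1])          # cost weight on the uncontrollable state x_2
```

The sequence `p22` increases without bound, in agreement with the conclusion that the limit (2.4-19) does not exist.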

The next Problem 1 applies the results of Theorem 1 to the plant (2) when $G = O_{n\times m}$
and $\Phi$ is a stability matrix.

Problem 2.4-1  Consider the sequence $\{x(k)\}_{k=0}^{N-1}$ satisfying the difference equation

$$x(k+1) = \Phi x(k) \qquad (2.4-20)$$

Show that

$$\sum_{k=0}^{N-1} \|x(k)\|^2_{\psi_x} = \|x(0)\|^2_{\mathcal{L}(N)} \qquad (2.4-21)$$

where $\mathcal{L}(N)$ is the symmetric nonnegative definite matrix obtained by the following Lyapunov
difference equation

$$\mathcal{L}(j+1) = \Phi'\mathcal{L}(j)\Phi + \psi_x, \qquad j = 0, 1, \dots \qquad (2.4-22)$$

initialized from $\mathcal{L}(0) = O_{n\times n}$. Next, show that the following limits exist

$$\lim_{N\to\infty} \sum_{k=0}^{N-1} \|x(k)\|^2_{\psi_x} = \|x(0)\|^2_{\mathcal{L}(\infty)} \qquad (2.4-23)$$

$$\lim_{N\to\infty} \mathcal{L}(N) =: \mathcal{L}(\infty) \qquad (2.4-24)$$

provided that $\Phi$ is a stability matrix, i.e.

$$|\lambda(\Phi)| < 1 \qquad (2.4-25)$$

if $\lambda(\Phi)$ denotes any eigenvalue of $\Phi$. Finally, show that $\mathcal{L}(\infty)$ satisfies the following (algebraic)
Lyapunov equation

$$\mathcal{L}(\infty) = \Phi'\mathcal{L}(\infty)\Phi + \psi_x \qquad (2.4-26)$$

That (26) has a unique solution under (25) follows from a result of matrix theory [Fra64].
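
The identities of Problem 2.4-1 are easy to check numerically before attempting a proof. The sketch below assumes a hypothetical stable $\Phi$ and unit weight $\psi_x$; the function name `lyapunov_iter` is an illustrative choice. It compares the accumulated cost with $\|x(0)\|^2_{\mathcal{L}(N)}$ as in (2.4-21) and evaluates the residual of the algebraic Lyapunov equation (2.4-26) after many iterations of (2.4-22).

```python
import numpy as np

def lyapunov_iter(Phi, psi_x, N):
    """Iterate the Lyapunov difference equation (2.4-22) N times from L(0) = O."""
    L = np.zeros_like(psi_x)
    for _ in range(N):
        L = Phi.T @ L @ Phi + psi_x
    return L

# Hypothetical stability matrix (eigenvalues 0.5 and -0.3) and weight.
Phi = np.array([[0.5, 1.0], [0.0, -0.3]])
psi_x = np.eye(2)
x0 = np.array([1.0, -2.0])

N = 8
x, total = x0.copy(), 0.0
for k in range(N):
    total += x @ psi_x @ x       # left-hand side of (2.4-21)
    x = Phi @ x

L_N = lyapunov_iter(Phi, psi_x, N)
L_inf = lyapunov_iter(Phi, psi_x, 500)   # numerical stand-in for the limit (2.4-24)
residual = np.linalg.norm(L_inf - (Phi.T @ L_inf @ Phi + psi_x))
```

Here `total` should match `x0 @ L_N @ x0`, and `residual` should be at round-off level.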

The next lemma will be used in the study of the limiting properties as $j \to \infty$ of the
solution P(j) of the Riccati equation (11)-(13).

Lemma 2.4-1. Let $\{P(j)\}_{j=0}^{\infty}$ be a sequence of matrices in $\mathbb{R}^{n\times n}$ such that:

i. every P(j) is symmetric and nonnegative definite

$$P(j) = P'(j) \ge 0 \qquad (2.4-27)$$

ii. $\{P(j)\}_{j=0}^{\infty}$ is monotonically nondecreasing, viz.

$$i \le j \;\Rightarrow\; P(i) \le P(j) \qquad (2.4-28)$$

iii. $\{P(j)\}_{j=0}^{\infty}$ is bounded from above, viz. there exists a matrix $Q \in \mathbb{R}^{n\times n}$ such
that, for every j,

$$P(j) \le Q \qquad (2.4-29)$$

Then, $\{P(j)\}_{j=0}^{\infty}$ admits a symmetric nonnegative definite limit P as $j \to \infty$

$$\lim_{j\to\infty} P(j) = P \qquad (2.4-30)$$

Proof  For every $x \in \mathbb{R}^n$ the real-valued sequence $\{\alpha(j)\} := \{x'P(j)x\}$ is, by ii., monotonically
nondecreasing and, by iii., upperbounded by $x'Qx$. Hence, there exists $\lim_{j\to\infty} \alpha(j) = \alpha$. Take
now $x = e_i$ where $e_i$ is the i-th vector of the natural basis of $\mathbb{R}^n$. With such a choice,
$x'P(j)x = P_{ii}(j)$ if $P_{ik}$ denotes the (i, k)-entry of P. Hence, we have established that there exist

$$\lim_{j\to\infty} P_{ii}(j) = P_{ii}, \qquad i = 1, \dots, n$$

Next, take $x = e_i + e_k$. Under such a choice, $x'P(j)x = P_{ii}(j) + 2P_{ik}(j) + P_{kk}(j)$. This admits a
limit as $j \to \infty$. Since $\lim_{j\to\infty} P_{ii}(j) = P_{ii}$ and $\lim_{j\to\infty} P_{kk}(j) = P_{kk}$, there exists

$$\lim_{j\to\infty} P_{ik}(j) = P_{ik}$$

Since we have established the existence of the limit as $j \to \infty$ of all entries of P(j), and P(j)
satisfies (27), it follows that P exists and is symmetric and nonnegative definite.

We show next that the solution of the Riccati iterations (11)-(13) initialized from
$P(0) = O_{n\times n}$ enjoys the properties i.-iii. of Lemma 1, provided that the pair $(\Phi, G)$
is stabilizable.

Proposition 2.4-1. Consider the matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated by the Riccati
iterations (11)-(13) initialized from $P(0) = O_{n\times n}$. Then,
$\{P(j)\}_{j=0}^{\infty}$ enjoys the properties i.-iii. of Lemma 2.4-1, provided that $(\Phi, G)$ is
a stabilizable pair.

Proof  Property i. of Lemma 1 is clearly satisfied. To prove property ii. we proceed as follows.
Consider the LQ optimal input $u^\circ_{[0,j+1)}$ for the regulation horizon [0, j + 1] and an initial plant
state x. Let $x^\circ_{[0,j+1]}$ be the corresponding state evolution. Then,

$$x'P(j+1)x = \sum_{k=0}^{j-1} \ell(x^\circ(k), u^\circ(k)) + \ell(x^\circ(j), u^\circ(j)) \ge \sum_{k=0}^{j-1} \ell(x^\circ(k), u^\circ(k)) \ge \min_{u_{[0,j)}} \sum_{k=0}^{j-1} \ell(x(k), u(k)) = x'P(j)x \qquad (2.4-31)$$

Hence, $\{P(j)\}_{j=0}^{\infty}$ is monotonically nondecreasing.

To check property iii., consider a feedback-gain matrix F which stabilizes $\Phi$, viz. $\Phi + GF$ is
a stability matrix. Let

$$\hat{u}(k) = F\hat{x}(k), \qquad k = 0, 1, \dots \qquad (2.4-32)$$

and correspondingly

$$\hat{x}(k+1) = (\Phi + GF)\hat{x}(k), \qquad \hat{x}(0) = x \qquad (2.4-33)$$

Recall that by (3-12), $\psi_x + F'M + M'F + F'\psi_u F$ is a symmetric and nonnegative definite matrix.
Then, by Problem 1, there exists a matrix $Q = Q' \ge 0$, solution of the Lyapunov equation

$$Q = (\Phi + GF)'Q(\Phi + GF) + \psi_x + F'M + M'F + F'\psi_u F \qquad (2.4-34)$$

and such that

$$x'Qx = \sum_{k=0}^{\infty} \ell(\hat{x}(k), \hat{u}(k)) \ge \sum_{k=0}^{j-1} \ell(\hat{x}(k), \hat{u}(k)) \ge \min_{u_{[0,j)}} \sum_{k=0}^{j-1} \ell(x(k), u(k)) = x'P(j)x \qquad (2.4-35)$$

Hence, $\{P(j)\}_{j=0}^{\infty}$ is upperbounded by Q.

Proposition 1, together with Lemma 1, enables us to establish a sufficient condition
for the existence of the limit of P(j) as $j \to \infty$.

Theorem 2.4-2. Consider the matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated by the Riccati
iterations (11)-(13) initialized from $P(0) = O_{n\times n}$. Then, if $(\Phi, G)$ is a stabilizable
pair, there exists the limit of P(j) as $j \to \infty$

$$P := \lim_{j\to\infty} P(j) \qquad (2.4-36)$$

P is symmetric nonnegative definite and satisfies the algebraic Riccati equation

$$P = \Phi'P\Phi - (M' + \Phi'PG)(\psi_u + G'PG)^{-1}(M + G'P\Phi) + \psi_x \qquad (2.4-37)$$

$$\phantom{P} = \Phi'P\Phi - F'(\psi_u + G'PG)F + \psi_x \qquad (2.4-38)$$

$$\phantom{P} = (\Phi + GF)'P(\Phi + GF) + F'\psi_u F + M'F + F'M + \psi_x \qquad (2.4-39)$$

with

$$F = -(\psi_u + G'PG)^{-1}(M + G'P\Phi) \qquad (2.4-40)$$

Under the above circumstances, the infinite-horizon or steady-state LQR, for which

$$\min_{u_{[0,\infty)}} J\left(x, u_{[0,\infty)}\right) = x'Px, \qquad (2.4-41)$$

is given by the state-feedback control

$$u(k) = Fx(k), \qquad k = 0, 1, \dots \qquad (2.4-42)$$
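
The statements of Proposition 2.4-1 and Theorem 2.4-2 can be illustrated with a short numerical sketch. The plant, the weights and the cross term below are hypothetical values chosen here so that $(\Phi, G)$ is reachable and (2.4-5), (2.4-6) hold; `riccati_forward` is an illustrative name. The loop iterates (2.4-11) from $P(0) = O$, records the smallest eigenvalue of each increment $P(j+1) - P(j)$, and finally evaluates the residual of the algebraic Riccati equation (2.4-37).

```python
import numpy as np

def riccati_forward(P, Phi, G, psi_x, psi_u, M):
    """One Riccati forward iteration (2.4-11) together with the gain (2.4-10)."""
    S = psi_u + G.T @ P @ G
    L = M + G.T @ P @ Phi
    F = -np.linalg.solve(S, L)
    return Phi.T @ P @ Phi + L.T @ F + psi_x, F

# Hypothetical unstable but reachable plant, with a cross term in the loss.
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
psi_x = np.eye(2)
psi_u = np.array([[1.0]])
M = np.array([[0.2, 0.1]])       # psi_x - M' psi_u^{-1} M remains positive definite

P = np.zeros((2, 2))
min_increment = np.inf
for j in range(300):
    P_next, _ = riccati_forward(P, Phi, G, psi_x, psi_u, M)
    P_next = 0.5 * (P_next + P_next.T)           # remove round-off asymmetry
    min_increment = min(min_increment, np.linalg.eigvalsh(P_next - P).min())
    P = P_next

# Residual of the ARE (2.4-37) at the numerically converged P.
L = M + G.T @ P @ Phi
are_rhs = Phi.T @ P @ Phi - L.T @ np.linalg.solve(psi_u + G.T @ P @ G, L) + psi_x
are_residual = np.linalg.norm(P - are_rhs)
```

In this sketch `min_increment` stays nonnegative up to round-off (monotonicity) and `are_residual` is negligible, which is what the theorem predicts for a stabilizable pair.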

It is to be pointed out that Theorem 2 does not give any assurance on the
asymptotic stability of the resulting closed-loop system

$$x(k+1) = (\Phi + GF)x(k) \qquad (2.4-43)$$

Stability has to be guaranteed in order to make the steady-state LQR applicable
in practice. We now begin to study stability of the closed-loop system (43), should
P(j) admit a limit P for $j \to \infty$. For the sake of simplicity, this study will be
carried out with reference to the Linear Quadratic Output Regulation (LQOR)
problem defined as follows.

LQOR problem in the time-invariant case  Here the plant is described
by a linear time-invariant state-space representation

$$x(k+1) = \Phi x(k) + Gu(k), \qquad x(0) = x, \qquad y(k) = Hx(k) \qquad (2.4-44)$$

where $y(k) \in \mathbb{R}^p$ is the output to be regulated at zero. A quadratic performance
index as in (3) is considered with instantaneous loss

$$\ell(x, u) := \|y\|^2_{\psi_y} + \|u\|^2_{\psi_u} \qquad (2.4-45)$$

where $\psi_u$ satisfies (5) and

$$\psi_y = \psi_y' > 0 \qquad (2.4-46)$$

Since in view of (44)

$$\|y\|^2_{\psi_y} = \|x\|^2_{\psi_x}, \qquad \psi_x = H'\psi_y H, \qquad (2.4-47)$$

it appears that the LQOR problem is an LQR problem with M = 0. However,
we recall that, by (1-8)-(1-13), each LQR problem can be cast into an equivalent
problem with no cross-product terms in the instantaneous loss. In turn, any state
instantaneous loss such as $\|x\|^2_{\psi_x}$ can be equivalently rewritten as $\|y\|^2_{\psi_y}$, $y = Hx$
and $\psi_y = \psi_y' > 0$, if H and $\psi_y$ are selected as follows. Let rank $\psi_x = p \le n$. Then
there exist matrices $H \in \mathbb{R}^{p\times n}$ and $\psi_y = \psi_y' > 0$ such that the factorization (47)
holds. Any such pair $(H, \psi_y)$ can be used for rewriting $\|x\|^2_{\psi_x}$ as $\|y\|^2_{\psi_y}$. Therefore,
we conclude that, in principle, there is no loss of generality in considering the LQOR
in place of the LQR problem.

For any finite regulation horizon, the solution of the LQOR problem in the
time-invariant case is given by (9)-(15) of Theorem 1, provided that $M = O$ and $\psi_x$ is
as in (47). An advantage of the LQOR formulation is that the limiting properties
as $N \to \infty$ of the LQOR solution can be nicely related to the system-theoretic
properties of the plant $\Sigma = (\Phi, G, H)$ in (44).

Problem 2.4-2  Consider the plant (44) in a Gilbert-Kalman (GK) canonical observability
decomposition

$$\Phi = \begin{bmatrix} \Phi_o & O \\ \Phi_{\bar{o}o} & \Phi_{\bar{o}} \end{bmatrix}, \qquad G = \begin{bmatrix} G_o \\ G_{\bar{o}} \end{bmatrix}, \qquad H = \begin{bmatrix} H_o & O \end{bmatrix}, \qquad x = \begin{bmatrix} x_o' & x_{\bar{o}}' \end{bmatrix}' \qquad (2.4-48)$$

It is to be remarked that this can be assumed w.l.o.g. since any plant (44) is algebraically equivalent
to (48). With reference to (10)-(15) with $M = O$ and $\psi_x = H'\psi_y H$, show that, if

$$P(0) = \begin{bmatrix} P_o(0) & O \\ O & O \end{bmatrix} \qquad (2.4-49)$$

then

$$P(j) = \begin{bmatrix} P_o(j) & O \\ O & O \end{bmatrix} \qquad (2.4-50)$$

with

$$P_o(j+1) = \Phi_o'P_o(j)\Phi_o - \Phi_o'P_o(j)G_o\left[\psi_u + G_o'P_o(j)G_o\right]^{-1}G_o'P_o(j)\Phi_o + H_o'\psi_y H_o \qquad (2.4-51)$$

and

$$F(j) = \begin{bmatrix} F_o(j) & O \end{bmatrix} \qquad (2.4-52)$$

with

$$F_o(j) = -\left[\psi_u + G_o'P_o(j-1)G_o\right]^{-1}G_o'P_o(j-1)\Phi_o \qquad (2.4-53)$$

Expressing in words the conclusions of Problem 2, we can say that the solution of
the LQOR problem depends solely on the observable subsystem $\Sigma_o = (\Phi_o, G_o, H_o)$
of the plant, provided that only the observable component $x_o(N)$ of the final state
$x(N) = [\,x_o'(N) \;\; x_{\bar{o}}'(N)\,]'$ is costed.

For the time-invariant LQOR problem, the next Theorem 3 gives a necessary and
sufficient condition for the existence of P in (36).

Theorem 2.4-3. Consider the time-invariant LQOR problem and the corresponding
matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated by the Riccati iterations (11)-(13), with
$M = O_{m\times n}$, initialized from $P(0) = O_{n\times n}$. Let $\Sigma_o = (\Phi_o, G_o, H_o)$ be the completely
observable subsystem obtained via a GK canonical observability decomposition
of the plant (44) $\Sigma = (\Phi, G, H)$. Next, let $\Phi_{o\bar{r}}$ be the state transition matrix of
the unreachable subsystem obtained via a GK canonical reachability decomposition
of $\Sigma_o$. Then, there exists

$$P = \lim_{j\to\infty} P(j) \qquad (2.4-54)$$

if and only if $\Phi_{o\bar{r}}$ is a stability matrix.

Proof  According to Problem 2, everything depends on $\Sigma_o$. Thus, w.l.o.g., we can assume that
the plant is $\Sigma_o$. To say that $\Phi_{o\bar{r}}$ is a stability matrix is equivalent to stabilizability of $\Sigma_o$. Then,
by Theorem 2, the above condition implies (54).

We prove that the condition is necessary by contradiction. Assume that $\Phi_{o\bar{r}}$ is not a stability
matrix. Therefore, there are observable initial states of the form $x_o = [\,x_{or}' = O \;\; x_{o\bar{r}}'\,]'$ such
that $\sum_{k=0}^{j-1} \|y(k)\|^2_{\psi_y}$ diverges as $j \to \infty$, irrespective of the input sequence. This contradicts (54).

The reader is warned of the right order for the GK canonical decompositions that
must be used to get $\Phi_{o\bar{r}}$ in Theorem 3.



Example 2.4-2  Consider the plant $\Sigma = (\Phi, G, H)$ with

$$\Phi = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}, \qquad G = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad H = \begin{bmatrix} 1 & 0 \end{bmatrix}$$

$\Sigma$ is seen to be completely observable. Hence, we can set $\Sigma_o = \Sigma$. Further, $\Sigma_o$ is already in a GK
reachability canonical decomposition with $\Phi_{o\bar{r}} = 2$. Hence, we conclude that the limit (54) does
not exist.

If we reverse the order of the GK canonical decompositions, we first get the unreachable subsystem $\Sigma_{\bar{r}}$
of $\Sigma$. It equals $\Sigma_{\bar{r}} = (2, 0, 0)$, which is unobservable. Then, $\Phi_{\bar{r}o}$ is "empty" (no unreachable and
observable eigenvalue). Hence, we would erroneously conclude that the limit (54) exists.
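
The structural claims of Example 2.4-2 can be checked with rank tests. The sketch below assumes the matrices as reconstructed above and uses the standard Popov-Belevitch-Hautus (PBH) rank test to list unreachable eigenvalues; the test and the helper name `unreachable_eigenvalues` are not taken from the text and are used here only as a convenient check.

```python
import numpy as np

Phi = np.array([[1.0, 1.0], [0.0, 2.0]])
G = np.array([[1.0], [0.0]])
H = np.array([[1.0, 0.0]])
n = Phi.shape[0]

# Observability and reachability matrices of the second-order plant.
obs = np.vstack([H, H @ Phi])
reach = np.hstack([G, Phi @ G])
rank_obs = np.linalg.matrix_rank(obs)
rank_reach = np.linalg.matrix_rank(reach)

def unreachable_eigenvalues(Phi, G):
    """Eigenvalues lam of Phi for which rank [lam*I - Phi, G] < n (PBH test)."""
    n = Phi.shape[0]
    return [lam for lam in np.linalg.eigvals(Phi)
            if np.linalg.matrix_rank(np.hstack([lam * np.eye(n) - Phi, G])) < n]

unreachable = unreachable_eigenvalues(Phi, G)
```

With these data `rank_obs` equals 2 (so $\Sigma_o = \Sigma$), `rank_reach` equals 1, and the only unreachable eigenvalue is 2, which lies outside the unit disk.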

Problem 2.4-3  Consider the LQOR problem for the plant $\Sigma = (\Phi, G, H)$. Assume that the
matrix $\Phi_{o\bar{r}}$, defined in Theorem 3, is a stability matrix. Then, by Theorem 3, there exists P as in
(54). Prove by contradiction that P is positive definite if and only if the pair $(\Phi, H)$ is completely
observable. [Hint: Make use of (50) and positive definiteness of $\psi_y$.]

Theorem 2.4-4. Consider the time-invariant LQOR problem and the corresponding
matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated by the Riccati iterations (11)-(13), with
$M = O_{m\times n}$, initialized from $P(0) = O_{n\times n}$. Then, there exists

$$P = \lim_{j\to\infty} P(j)$$

such that the corresponding feedback-gain matrix

$$F = -(\psi_u + G'PG)^{-1}G'P\Phi \qquad (2.4-55)$$

yields a state-feedback control law $u(k) = Fx(k)$ which stabilizes the plant, viz.
$\Phi + GF$ is a stability matrix, if and only if the plant $\Sigma = (\Phi, G, H)$ is stabilizable
and detectable.



Proof  We first show that stabilizability and detectability of $\Sigma$ is a necessary condition for the
existence of P and stability of the corresponding closed-loop system.

First, $\Phi + GF$ stable implies stabilizability of the pair $(\Phi, G)$. Second, necessity of detectability
of $(\Phi, H)$ is proved by contradiction. Assume, then, that $(\Phi, H)$ is undetectable. Referring to
Problem 2, w.l.o.g. $(\Phi, G, H)$ can be considered in a GK canonical observability decomposition
and, according to (52), $F = [\,F_o \;\; O\,]$. Hence, the unobservable subsystem of $\Sigma$ is left unchanged
by the steady-state LQ regulator. This contradicts stability of $\Phi + GF$.

We next show that stabilizability and detectability of $\Sigma$ is a sufficient condition. Since the
pair $(\Phi, G)$ is stabilizable, by Theorem 2 there exists P. Further, according to Problem 2, the
unobservable eigenvalues of $\Sigma$ are again eigenvalues of $\Phi + GF$. Since by detectability of $(\Phi, H)$
they are stable, w.l.o.g. we can complete the proof by assuming that $(\Phi, H)$ is completely
observable. Suppose now that $\Phi + GF$ is not a stability matrix, and show that this contradicts
(54). To see this, consider that complete observability of $(\Phi, H)$ implies complete observability of
$\left(\Phi + GF, [\,F' \;\; H'\,]'\right)$ for any F of compatible dimensions. Then, if $\Phi + GF$ is not a stability
matrix, there exist states x(0) such that

$$\sum_{k=0}^{j-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] = \sum_{k=0}^{j-1} x'(k)\begin{bmatrix} F' & H' \end{bmatrix}\begin{bmatrix} \psi_u & O \\ O & \psi_y \end{bmatrix}\begin{bmatrix} F \\ H \end{bmatrix}x(k)$$

diverges as $j \to \infty$. This contradicts (54).

We next show that, whenever the validity conditions of Theorem 4 are fulfilled, the
Riccati iterations (11)-(13), with $M = O_{m\times n}$, initialized from any $P(0) = P'(0) \ge 0$,
yield the same limit as (54).

Lemma 2.4-2. Consider the time-invariant LQOR problem (44)-(46) with terminal-state
cost weight $P(0) = P'(0) \ge 0$. Let the plant be stabilizable and detectable.
Then, the corresponding matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated by the Riccati
iterations (11)-(13) with $M = O_{m\times n}$ admits, as $j \to \infty$, a unique limit, no matter
how P(0) is chosen. Such a limit is the same as the one of (54). Further,

$$x'Px = \lim_{j\to\infty} \min_{u_{[0,j)}} \left\{ \sum_{k=0}^{j-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] + \|x(j)\|^2_{P(0)} \right\} \qquad (2.4-56)$$

and the optimal input sequence minimizing the performance index in (56) is given
by the state-feedback control law $u(k) = Fx(k)$ with F as in (55).

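Proof  Since the plant is stabilizable and detectable, Theorem 4 guarantees that, if we adopt the
control law $u^\circ(k) = Fx^\circ(k)$, then

$$\lim_{j\to\infty} \|x^\circ(j)\|^2_{P(0)} = 0$$

and

$$\lim_{j\to\infty} \left\{ \sum_{k=0}^{j-1}\left[\|y^\circ(k)\|^2_{\psi_y} + \|u^\circ(k)\|^2_{\psi_u}\right] + \|x^\circ(j)\|^2_{P(0)} \right\} = x'Px$$

where the superscript denotes all system variables obtained by using the above control law. Assume
now that F is not the steady-state LQOR feedback-gain matrix for some P(0) and initial
state $x \in \mathbb{R}^n$. Then,

$$x'Px > \lim_{j\to\infty} \min_{u_{[0,j)}} \left\{ \sum_{k=0}^{j-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] + \|x(j)\|^2_{P(0)} \right\} \ge \lim_{j\to\infty} \min_{u_{[0,j)}} \sum_{k=0}^{j-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right]$$

This contradicts the steady-state optimality of F for $P(0) = O_{n\times n}$.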

Whenever the Riccati iterations (11)-(13) for the LQOR problem converge as $j \to \infty$
and $P = \lim_{j\to\infty} P(j)$, the limit matrix P satisfies the following algebraic Riccati
equation (ARE)

$$P = \Phi'P\Phi - \Phi'PG(\psi_u + G'PG)^{-1}G'P\Phi + \psi_x \qquad (2.4-57)$$

Conversely, not every solution of (57) need coincide with a limiting matrix of
the Riccati iterations for the LQOR problem. The situation again simplifies under
stabilizability and detectability of the plant.

Lemma 2.4-3. Consider the time-invariant LQOR problem. Let the plant be stabilizable
and detectable. Then, the ARE (57) has a unique symmetric nonnegative
definite solution which coincides with the matrix P in (54).

Proof  Assume that, besides P, (57) has a different solution $\tilde{P} = \tilde{P}' \ge 0$, $\tilde{P} \ne P$. If the Riccati
iterations (11)-(13) are initialized from $P(0) = \tilde{P}$, we get $P(j) = \tilde{P}$, $j = 1, 2, \dots$ Then, P and $\tilde{P}$
are two different limits of the Riccati iterations. This contradicts Lemma 2.

Since the ARE is a nonlinear matrix equation, it has many solutions. Among
these solutions P, the strong solutions are those yielding a feedback-gain
matrix $F = -(\psi_u + G'PG)^{-1}G'P\Phi$ for which the closed-loop transition matrix $\Phi + GF$ has
eigenvalues in the closed unit disk. The following result completes Lemma 3 in this
respect.

Result 2.4-1. Consider the time-invariant LQOR problem and its associated ARE.
Then:

i. The ARE has a unique strong solution if and only if the plant is stabilizable;

ii. The strong solution is the only nonnegative definite solution of the ARE if
and only if the plant is stabilizable and has no undetectable eigenvalue outside
the closed unit disk.

The most useful results of steady-state LQR theory are summed up in Theorem 5.
Its conclusions are reassuring in that, under general conditions, they
guarantee that the steady-state LQOR exists and stabilizes the plant. One important
implication is that steady-state LQR theory provides a tool for systematically
designing regulators which, while optimizing a performance index of engineering
significance, yield stable closed-loop systems.

Theorem 2.4-5. Consider the time-invariant LQOR problem (44)-(46) and the
related matrix sequence $\{P(j)\}_{j=0}^{\infty}$ generated via the Riccati iterations (11)-(13)
with $M = O_{m\times n}$, initialized from any $P(0) = P'(0) \ge 0$. Then, there exists

$$P = \lim_{j\to\infty} P(j) \qquad (2.4-58)$$

such that

$$x'Px = V_\infty(x) = \lim_{j\to\infty} \min_{u_{[0,j)}} \left\{ \sum_{k=0}^{j-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] + \|x(j)\|^2_{P(0)} \right\} \qquad (2.4-59)$$

and the LQOR control law given by

$$u(k) = Fx(k) \qquad (2.4-60)$$

$$F = -(\psi_u + G'PG)^{-1}G'P\Phi \qquad (2.4-61)$$

stabilizes the plant, if and only if the plant $(\Phi, G, H)$ is stabilizable and detectable.
Further, under such conditions, the matrix P in (58) coincides with the unique
symmetric nonnegative definite solution of the ARE (57).



Main points of the section The infinite-time or steady-state LQOR solution can 
be used so as to stabilize any time-invariant plant, while optimizing a quadratic 
performance index, provided that the plant is stabilizable and detectable. The 
steady-state LQOR consists of a time-invariant state-feedback whose gain (61) 
is expressed in terms of the limit matrix P (58) of the Riccati iterations (11)- 
(13). This also coincides with the unique symmetric nonnegative definite solution 
of the ARE (57). While stabilizability appears as an obvious intrinsic property 
which cannot be enforced by the designer, on the contrary detectability can be 
guaranteed by a suitable choice of the matrix H or the state weighting matrix ip x 
(47). 
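
The main points above can be illustrated with a short numerical sketch. Everything specific in it is an assumption made for illustration: the second-order plant is hypothetical (unstable, reachable and observable), and `riccati_lqor` is a name chosen here. The sketch runs the Riccati iterations (2.4-11) with M = O from two different initializations P(0), then checks that both runs reach the same limit, that the limit satisfies the ARE (2.4-57), and that the gain (2.4-61) yields a stable closed loop.

```python
import numpy as np

def riccati_lqor(Phi, G, H, psi_y, psi_u, P0, iters=500):
    """Riccati iterations (2.4-11) with M = O and psi_x = H' psi_y H, started from P0."""
    psi_x = H.T @ psi_y @ H
    P = P0.copy()
    for _ in range(iters):
        S = psi_u + G.T @ P @ G
        K = np.linalg.solve(S, G.T @ P @ Phi)
        P = Phi.T @ P @ Phi - Phi.T @ P @ G @ K + psi_x
        P = 0.5 * (P + P.T)                      # remove round-off asymmetry
    F = -np.linalg.solve(psi_u + G.T @ P @ G, G.T @ P @ Phi)   # gain (2.4-61)
    return P, F

# Hypothetical plant: unstable, reachable and observable.
Phi = np.array([[1.1, 1.0], [0.0, 0.9]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
psi_y = np.array([[1.0]])
psi_u = np.array([[0.1]])

P_a, F_a = riccati_lqor(Phi, G, H, psi_y, psi_u, np.zeros((2, 2)))
P_b, F_b = riccati_lqor(Phi, G, H, psi_y, psi_u, 10.0 * np.eye(2))

# Closed-loop spectral radius and ARE (2.4-57) residual for the first run.
rho = np.max(np.abs(np.linalg.eigvals(Phi + G @ F_a)))
psi_x = H.T @ psi_y @ H
are_rhs = (Phi.T @ P_a @ Phi
           - Phi.T @ P_a @ G @ np.linalg.solve(psi_u + G.T @ P_a @ G, G.T @ P_a @ Phi)
           + psi_x)
are_residual = np.linalg.norm(P_a - are_rhs)
```

For this plant one expects `P_a` and `P_b` to coincide to numerical precision and `rho` to be strictly less than one, consistent with Theorem 2.4-5.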

Problem 2.4-4  Show that the zero eigenvalues of the plant are also eigenvalues of the LQOR
closed-loop system.

Problem 2.4-5  (Output Dynamic Compensator as an LQOR) Consider the SISO plant
described by the following difference equation

$$y(t) + a_1 y(t-1) + \cdots + a_n y(t-n) = b_1 u(t-1) + \cdots + b_n u(t-n) \qquad (2.4-62)$$

Show that:

i. $$x(t) := [\,y(t-n+1) \;\cdots\; y(t) \;\; u(t-n+1) \;\cdots\; u(t-1)\,]' \qquad (2.4-63)$$
is a state-vector for the plant (62);

ii. The state-space representation $(\Phi, G, H)$ with state x(t) is stabilizable and detectable if
the polynomials

$$A(q) := q^n + a_1 q^{n-1} + \cdots + a_n, \qquad B(q) := b_1 q^{n-1} + \cdots + b_n \qquad (2.4-64)$$

have a strictly Hurwitz greatest common divisor;

iii. Under the assumption in ii., the steady-state LQOR, obtained by using the triplet $(\Phi, G, H)$,
consists of an output dynamic compensator of the form

$$u(t) + r_1 u(t-1) + \cdots + r_{n-1}u(t-n+1) = \sigma_0 y(t) + \sigma_1 y(t-1) + \cdots + \sigma_{n-1}y(t-n+1) \qquad (2.4-65)$$

[Hint: Use the result [GS84] according to which $(\Phi, G)$ is completely reachable if and only if
A(q) and B(q) are coprime.]

Problem 2.4-6 Consider the LQOR problem for the SISO plant 



1 


G = 


" 


1 3 


1 


2 2 - 





$$H = [\,\alpha \;\; 1\,], \qquad \alpha \in \mathbb{R} \qquad (2.4-66)$$



and the cost $\sum_{k=0}^{\infty}[y^2(k) + \rho u^2(k)]$, $\rho > 0$. Find the values of $\alpha$ for which the steady-state
LQOR problem has no solution yielding an asymptotically stable closed-loop system. [Hint:
The unobservable eigenvalues of (66) coincide with the common roots of $\chi_\Phi(z) := \det(zI_2 - \Phi)$
and $H\,\mathrm{Adj}(zI_2 - \Phi)G$.]
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Problem 2.4-7 Consider the plant 

y(t) - ij/(t - 2) = u(t - 1) + it(t - 2) 

with initial conditions 

xi(0) := ij,(t-l) + tt (-l) 
x 2 (0) := 2/(0) 

and state-feedback control law u(t) = — \y(t). Compute the corresponding cost J = "^Z^L^iy 2 (k)+ 
pu 2 (k)], p > 0. [Hint: Use the Lyapunov equation (26) with a suitable choice for x(t). ] 



Problem 2.4-8 Consider the LQOR problem for the plant 



$ = 


[ \ o " 


G = 


9i 




-2 a 
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and performance index J = SfcLoI^ 2 ^) + 10 4 u 2 (k)}. Give detailed answers to the following 
questions. 

i. Find the set of values for the parameters (a, ffi, <?2) for which there exists P = lim.,^^ P(j) 
as in (54). 

ii. Assuming that P as in i. exists, find for which values of (a, gi, 32) the state-feedback control 
law u(k) = — (10~ 4 + G' PG)~ 1 G' P<&x{k) makes the closed-loop system asymptotically 
stable. 



Problem 2.4-9  (LQOR with a Prescribed Degree of Stability) Consider the LQOR problem for
a plant $(\Phi, G, H)$ and performance index

$$\sum_{k=0}^{\infty} r^{2k}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] \qquad (2.4-67)$$

with $r > 1$, $\psi_y = \psi_y' > 0$ and $\psi_u = \psi_u' > 0$. Show that:

i. The above LQOR problem is equivalent to an LQOR problem with the following performance
index

$$\sum_{k=0}^{\infty}\left[\|\bar{y}(k)\|^2_{\psi_y} + \|\bar{u}(k)\|^2_{\psi_u}\right]$$

and a new plant $(\bar{\Phi}, \bar{G}, \bar{H})$ to be specified;

ii. Provided that $(\bar{\Phi}, \bar{H})$ is a detectable pair, the eigenvalues $\lambda$ of the characteristic polynomial
of the closed-loop system consisting of the initial plant optimally regulated according to
(67) satisfy the inequality

$$|\lambda| < \frac{1}{r}$$



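Problem 2.4-10  (Tracking as a Regulation Problem) Consider a detectable plant
$(\Phi, G, H)$ with input u(t), state x(t) and scalar output y(t). Let r be any real number. Define
$\varepsilon(t) := y(t) - r$. Prove that, if 1 is an eigenvalue of $\Phi$, viz. $\chi_\Phi(1) := \det(I_n - \Phi) = 0$, there
exist eigenvectors $x_r$ of $\Phi$ associated with the eigenvalue 1 such that, for $\tilde{x}(t) := x(t) - x_r$, we
have

$$\tilde{x}(t+1) = \Phi\tilde{x}(t) + Gu(t), \qquad \varepsilon(t) = H\tilde{x}(t) \qquad (2.4-68)$$

This shows that, under the stated assumptions, the plant with input u(t), state $\tilde{x}(t)$ and output
$\varepsilon(t)$ has a description coinciding with the initial triplet $(\Phi, G, H)$. Then, if $(\Phi, G)$ is stabilizable,
the LQ regulation law $u(t) = F\tilde{x}(t)$ minimizing

$$\sum_{k=0}^{\infty}\left[\varepsilon^2(k) + \|u(k)\|^2_{\psi_u}\right], \qquad \psi_u > 0 \qquad (2.4-69)$$

for the plant (68), exists and the corresponding closed-loop system is asymptotically stable.

Problem 2.4-11  (Tracking as a Regulation Problem) Consider again the situation described in
Problem 2.4-10 where $u(t) \in \mathbb{R}$. Let

$$\delta x(t) := x(t) - x(t-1), \qquad \delta u(t) := u(t) - u(t-1), \qquad \xi(t) := [\,\delta x'(t) \;\; \varepsilon(t)\,]' \in \mathbb{R}^{n+1} \qquad (2.4-70)$$

i. Show that the state-space representation of the plant with input $\delta u(t)$, state $\xi(t)$ and
output $\varepsilon(t)$ is given by the triplet

$$\bar{\Sigma} = \left( \begin{bmatrix} \Phi & O_n \\ H\Phi & 1 \end{bmatrix}, \; \begin{bmatrix} G \\ HG \end{bmatrix}, \; \begin{bmatrix} O_n' & 1 \end{bmatrix} \right)$$

ii. Let $\bar{\Theta}$ be the observability matrix of $\bar{\Sigma}$. Show that by taking elementary row operations
on $\bar{\Theta}$, we can get a matrix which can be factorized as follows

$$\begin{bmatrix} \Theta & O_n \\ O_n' & 1 \end{bmatrix}\begin{bmatrix} \Phi & O_n \\ O_n' & 1 \end{bmatrix}, \qquad \Theta := \begin{bmatrix} H \\ H\Phi \\ \vdots \\ H\Phi^{n-1} \end{bmatrix}$$

Show that $(\Phi, H)$ detectable implies detectability of $\bar{\Sigma}$.

iii. Let $\bar{R}$ be the reachability matrix of $\bar{\Sigma}$. Define

$$\tilde{R} := \begin{bmatrix} I_n & O_n \\ -H & 1 \end{bmatrix}\bar{R}$$

Show that by taking elementary column operations on $\tilde{R}$, we can get a matrix which can
be factorized as $L\hat{R}$, with

$$L := \begin{bmatrix} G & \Phi - I_n \\ 0 & H \end{bmatrix}, \qquad \hat{R} := \begin{bmatrix} 1 & O_n' \\ O_n & R \end{bmatrix}, \qquad R := \begin{bmatrix} G & \Phi G & \cdots & \Phi^{n-1}G \end{bmatrix}$$

iv. Prove that nonsingularity of L is equivalent to $H_{yu}(1) := H(I_n - \Phi)^{-1}G \ne 0$.

v. Prove that $(\Phi, G)$ stabilizable and $H_{yu}(1) \ne 0$ implies that $\bar{\Sigma}$ is stabilizable.

vi. Conclude that if $(\Phi, G, H)$ is stabilizable and detectable, and $H_{yu}(1) \ne 0$, the LQ regulation
law

$$\delta u(t) = F_x\,\delta x(t) + F_\varepsilon\,\varepsilon(t) \qquad (2.4-71)$$

minimizing

$$\sum_{k=0}^{\infty}\left[\varepsilon^2(k) + \rho[\delta u(k)]^2\right], \qquad \rho > 0 \qquad (2.4-72)$$

for the plant $\bar{\Sigma}$, exists and the corresponding closed-loop system is asymptotically stable.
Note that (71) gives

$$u(t) - u(0) = \sum_{k=1}^{t} \delta u(k) = F_x\left[x(t) - x(0)\right] + F_\varepsilon \sum_{k=1}^{t} \varepsilon(k) \qquad (2.4-73)$$

In other terms, (71) is a feedback-control law including an integral action from the tracking error.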

Problem 2.4-12  (Fake ARE) Consider the Riccati forward difference equation (11) with
$M = O_{m\times n}$ and $\psi_x$ as in (46) and (47):

$$P(j+1) = \Phi'P(j)\Phi - \Phi'P(j)G\left[\psi_u + G'P(j)G\right]^{-1}G'P(j)\Phi + H'\psi_y H \qquad (2.4-74)$$

We note that the above equation can be formally rewritten as follows

$$P(j) = \Phi'P(j)\Phi - \Phi'P(j)G\left[\psi_u + G'P(j)G\right]^{-1}G'P(j)\Phi + Q(j) \qquad (2.4-75)$$

$$Q(j) := H'\psi_y H + P(j) - P(j+1) \qquad (2.4-76)$$

The latter has the same form as the ARE (57) and has been called [BGW90] Fake ARE. Make
use of Theorem 5 to show that the feedback-gain matrix

$$F(j+1) = -\left[\psi_u + G'P(j)G\right]^{-1}G'P(j)\Phi \qquad (2.4-77)$$

stabilizes the plant, viz. $\Phi + GF(j+1)$ is a stability matrix, provided that $(\Phi, G, H)$ is stabilizable
and detectable, and P(j) has the property

$$P(j) - P(j+1) \ge 0 \qquad (2.4-78)$$

[Hint: Show that (78) implies that Q(j) can be written as $H'\psi_y H + \Gamma'\psi_\gamma\Gamma$, $\psi_\gamma = \psi_\gamma' > 0$,
$\Gamma \in \mathbb{R}^{r\times n}$, $r := \mathrm{rank}[P(j) - P(j+1)]$. Next, prove that detectability of $(\Phi, H)$ implies detectability
of $(\Phi, [\,H' \;\; \Gamma'\,]')$. Finally, consider the Fake ARE.]



2.5 Steady-State LQR Computation

There are several numerical procedures available for computing the matrix P in
(4-58). We limit our discussion to the ones that will be used in this text. In particular,
we shall not enter here into numerical factorization techniques for solving LQ
problems, which will be touched upon for the dual estimation problem in Sect. 6.5.

Riccati Iterations 

Eqs. (4-11)-(4-13), with $M = O_{m\times n}$, can be iterated, once they are initialized from
any $P(0) = P'(0) \ge 0$, for computing P as in (4-58). Of the three different forms,
the third, viz.

$$P(j+1) = \left[\Phi + GF(j+1)\right]'P(j)\left[\Phi + GF(j+1)\right] + F'(j+1)\psi_u F(j+1) + \psi_x \qquad (2.5-1)$$

$$F(j+1) = -\left[\psi_u + G'P(j)G\right]^{-1}G'P(j)\Phi \qquad (2.5-2)$$

is referred to as the robustified form of the Riccati iterations. The attribute here
is motivated by the fact that, unlike the other two forms, it updates
the matrix P(j) by adding symmetric nonnegative definite matrices. When computations
with round-off errors are considered, this is a feature that can help to
obtain at each iteration step a new symmetric nonnegative definite matrix P(j), as
required by LQR theory.
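
A minimal sketch of the two update forms follows, for comparison. The plant and weights are hypothetical, and the names `step_standard` and `step_robustified` are introduced here only for illustration. In exact arithmetic the two steps coincide; the robustified step (2.5-1)-(2.5-2) builds P(j+1) as a sum of terms of the form $A'P(j)A$, $F'\psi_u F$ and $\psi_x$, each nonnegative definite whenever P(j) is.

```python
import numpy as np

def step_standard(P, Phi, G, psi_x, psi_u):
    """Riccati step in the first form (4-11) with M = O."""
    S = psi_u + G.T @ P @ G
    return Phi.T @ P @ Phi - Phi.T @ P @ G @ np.linalg.solve(S, G.T @ P @ Phi) + psi_x

def step_robustified(P, Phi, G, psi_x, psi_u):
    """Riccati step in the robustified form (2.5-1)-(2.5-2)."""
    F = -np.linalg.solve(psi_u + G.T @ P @ G, G.T @ P @ Phi)   # gain (2.5-2)
    A_cl = Phi + G @ F
    return A_cl.T @ P @ A_cl + F.T @ psi_u @ F + psi_x, F      # update (2.5-1)

# Hypothetical plant and weights (psi_x = H' H with H = [1 0]).
Phi = np.array([[1.1, 1.0], [0.0, 0.9]])
G = np.array([[0.0], [1.0]])
psi_x = np.diag([1.0, 0.0])
psi_u = np.array([[0.1]])

P_std = np.eye(2)
P_rob = np.eye(2)
for j in range(200):
    P_std = step_standard(P_std, Phi, G, psi_x, psi_u)
    P_rob, F = step_robustified(P_rob, Phi, G, psi_x, psi_u)

min_eig_rob = np.linalg.eigvalsh(0.5 * (P_rob + P_rob.T)).min()
```

On a well-conditioned example such as this one, both forms give the same iterates to round-off; the difference between them is expected to matter mainly in ill-conditioned cases, which this sketch does not attempt to reproduce.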

The rate of convergence of the Riccati iterations is generally not very rapid, 
even in the neighborhood of the steady-state solution P. The numerical procedure 
described next exhibits fast convergence in the vicinity of P. 
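
The robustified iterations (2.5-1)-(2.5-2) can be sketched numerically as follows. This is only an illustrative sketch: the function name `riccati_iterations`, the stopping tolerance and the double-integrator plant are assumptions made for the example, not part of the text.

```python
import numpy as np

def riccati_iterations(Phi, G, psi_x, psi_u, P0=None, tol=1e-12, max_iter=10_000):
    """Iterate (2.5-1)-(2.5-2) from P(0) until successive iterates agree."""
    n = Phi.shape[0]
    P = np.zeros((n, n)) if P0 is None else P0.copy()
    for j in range(max_iter):
        # (2.5-2): F(j+1) = -[psi_u + G'P(j)G]^{-1} G'P(j)Phi
        F = -np.linalg.solve(psi_u + G.T @ P @ G, G.T @ P @ Phi)
        Phi_cl = Phi + G @ F
        # (2.5-1): P(j+1) is a sum of symmetric nonnegative definite terms
        P_next = Phi_cl.T @ P @ Phi_cl + F.T @ psi_u @ F + psi_x
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, F, j + 1
        P = P_next
    return P, F, max_iter

# Hypothetical example plant (not from the text): a sampled double integrator
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]])
psi_x = H.T @ H
psi_u = np.array([[1.0]])

P, F, iters = riccati_iterations(Phi, G, psi_x, psi_u)
```

The number of iterations returned in `iters` gives a feel for the slow (linear) convergence mentioned above.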

Kleinman Iterations 

Given a stabilizing feedback-gain matrix $F_k \in \mathbb{R}^{m \times n}$, let $\mathcal{L}_k$ be the solution of the
Lyapunov equation

$$\mathcal{L}_k = \Phi_k'\,\mathcal{L}_k\,\Phi_k + F_k'\psi_u F_k + \psi_x \tag{2.5-3}$$
$$\Phi_k := \Phi + GF_k \tag{2.5-4}$$

The next feedback-gain matrix $F_{k+1}$ is then computed as

$$F_{k+1} = -(\psi_u + G'\mathcal{L}_k G)^{-1} G'\mathcal{L}_k\Phi \tag{2.5-5}$$

The iterative equations (3)-(5), $k = 0, 1, 2, \cdots$, enjoy the following properties.




Suppose that the ARE (4-57) has a unique nonnegative definite solution, e.g.
$(\Phi, G, H)$, with $\psi_x = H'\psi_y H$, $\psi_y = \psi_y' > 0$, stabilizable and detectable. Then,
provided that $F_0$ is such as to make $\Phi_0$ a stability matrix,

i. the sequence $\{\mathcal{L}_k\}_{k=0}^{\infty}$ is monotonic nonincreasing and lower bounded by the
solution $P$ of the ARE (4-57)

$$\mathcal{L}_0 \geq \cdots \geq \mathcal{L}_k \geq \mathcal{L}_{k+1} \geq P; \tag{2.5-6}$$

ii. $$\lim_{k \to \infty} \mathcal{L}_k = P; \tag{2.5-7}$$

iii. the rate of convergence to $P$ is quadratic, viz.

$$\|P - \mathcal{L}_{k+1}\| \leq c\,\|P - \mathcal{L}_k\|^2 \tag{2.5-8}$$

for any matrix norm and for a constant $c$ independent of the iteration index
$k$.

Eq. (8) shows that the rate of convergence of the Kleinman iterations is fast in
the vicinity of $P$. It is however required that the iterations be initialized from
a stabilizing feedback-gain matrix $F_0$. In order to speed up convergence, [AL84]
suggests selecting $F_0$ via a direct Schur-type method.

The main problem with Kleinman iterations is that (3) must be solved at each
iteration step. Although (3) is linear in $\mathcal{L}_k$, its solution cannot be obtained by
simple matrix inversion. Actually, the numerical effort for solving it may be rather
formidable since the number of linear equations that must be solved at each
iteration step equals $n(n+1)/2$ if $n$ denotes the plant order.

Kleinman iterations result from using the Newton-Raphson method [Lue69]
for solving the ARE (4-57). 
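
A minimal numerical sketch of the Kleinman iterations (2.5-3)-(2.5-5) is given below. The helper names `solve_dlyap` and `kleinman`, the example plant and the initial gain `F0` are assumptions introduced for illustration. For simplicity the Lyapunov equation is solved through a Kronecker-product vectorization with $n^2$ unknowns, rather than the $n(n+1)/2$ equations obtained by exploiting symmetry.

```python
import numpy as np

def solve_dlyap(A, Q):
    """Solve L = A' L A + Q via (I - A' kron A') vec(L) = vec(Q), row-major vec."""
    n = A.shape[0]
    M = np.eye(n * n) - np.kron(A.T, A.T)
    L = np.linalg.solve(M, Q.reshape(-1)).reshape(n, n)
    return 0.5 * (L + L.T)  # symmetrize against round-off

def kleinman(Phi, G, psi_x, psi_u, F0, iters=10):
    """Run a fixed number of Kleinman steps from a stabilizing gain F0."""
    F = F0
    history = []
    for _ in range(iters):
        Phi_k = Phi + G @ F                                        # (2.5-4)
        L = solve_dlyap(Phi_k, F.T @ psi_u @ F + psi_x)            # (2.5-3)
        F = -np.linalg.solve(psi_u + G.T @ L @ G, G.T @ L @ Phi)   # (2.5-5)
        history.append(L)
    return L, F, history

# Hypothetical example plant and stabilizing initial gain (not from the text)
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[0.5], [1.0]])
H = np.array([[1.0, 0.0]])
psi_x = H.T @ H
psi_u = np.array([[1.0]])
F0 = np.array([[-0.5, -1.0]])  # Phi + G F0 has eigenvalues of modulus 0.5

L, F, history = kleinman(Phi, G, psi_x, psi_u, F0)
```

Inspecting the differences between consecutive entries of `history` illustrates the monotonic behaviour (2.5-6) and the fast convergence (2.5-8).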

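Problem 2.5-1 Consider the matrix function

$$N(P) := -P + \Phi' P \Phi - \Phi' P G\, H^{-1}(P)\, G' P \Phi + \psi_x$$

where

$$H(P) := \psi_u + G' P G$$

The aim is to find the symmetric nonnegative definite matrix $P$ such that

$$N(P) = O_{n \times n}$$

Let $\mathcal{L}_{k-1} = \mathcal{L}_{k-1}' \geq 0$ be a given approximation to $P$. It is asked to find a next approximation
$\mathcal{L}_k$ by increasing $\mathcal{L}_{k-1}$ by a "small" correction $\tilde{\mathcal{L}}$

$$\mathcal{L}_k = \mathcal{L}_{k-1} + \tilde{\mathcal{L}}$$

$\tilde{\mathcal{L}}$, and hence $\mathcal{L}_k$, has to be determined in such a way that $N(\mathcal{L}_k) \approx O_{n \times n}$.
By omitting the terms in $\tilde{\mathcal{L}}$ of order higher than the first, show that

$$H^{-1}(\mathcal{L}_k) \approx H^{-1}(\mathcal{L}_{k-1}) - H^{-1}(\mathcal{L}_{k-1})\, G' \tilde{\mathcal{L}} G\, H^{-1}(\mathcal{L}_{k-1})$$

and, further, that $N(\mathcal{L}_k) \approx O_{n \times n}$ if $\mathcal{L}_k$ satisfies (3)-(4).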

Control-theoretic interpretation It is of interest, for its possible use in adaptive
control, to give a specific control-theoretic interpretation to the Kleinman
iterations. To this end, consider the quadratic cost $J(x, u_{[0,\infty)})$ under the assumption
that all inputs, except $u(0)$, are given by feeding back the current plant state by a
stabilizing constant gain matrix $F_k$, viz.

$$u(j) = F_k x(j), \quad j = 1, 2, \cdots \tag{2.5-9}$$
[Figure labels: initial condition $(0, x)$; switch time $t = 0+$; input $u(0)$; output $y(t)$; state $x(t)$; feedback gain $F_k$.]

Figure 2.5-1: A control-theoretic interpretation of Kleinman iterations.



The situation is depicted in Fig. 1 where $t = 0+$ indicates that the switch commutes
from position a to position b after $u(0)$ has been applied and before $u(1)$ is fed into
the plant.

Let the corresponding cost be denoted as follows

$$J(x, u(0), F_k) := J\big(x, u_{[0,\infty)} \mid u(j) = F_k x(j),\ j = 1, 2, \cdots\big) \tag{2.5-10}$$

We show that, for given $x$ and $F_k$,

$$u(0) = F_{k+1} x \tag{2.5-11}$$

minimizes (10) w.r.t. $u(0)$, if $F_{k+1}$ is related to $F_k$ via (3)-(5). To see this, rewrite
(10) as follows

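$$\begin{aligned}
J(x, u(0), F_k) &= \|x\|^2_{\psi_x} + \|u(0)\|^2_{\psi_u} + \sum_{j=1}^{\infty} \|x(j)\|^2_{(\psi_x + F_k'\psi_u F_k)} \\
&= \|x\|^2_{\psi_x} + \|u(0)\|^2_{\psi_u} + \|x(1)\|^2_{\mathcal{L}_k} \\
&= \|x\|^2_{\psi_x} + \|u(0)\|^2_{\psi_u} + \|\Phi x + Gu(0)\|^2_{\mathcal{L}_k}
\end{aligned} \tag{2.5-12}$$

where the first equality follows from (9), the second from Problem 4-1 if $\mathcal{L}_k$ is the
solution of the Lyapunov equation (3), and the third since $x(1) = \Phi x + Gu(0)$.
Minimization of (10) w.r.t. $u(0)$ yields (11) with $F_{k+1}$ as in (5).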

Problem 2.5-2 Consider (12) and define the symmetric nonnegative definite matrix $R_{k+1}$
implicitly via

$$x' R_{k+1} x := \min_{u(0)} J(x, u(0), F_k) \tag{2.5-13}$$

Show that $R_{k+1}$ satisfies the recursions

$$R_{k+1} = \Phi'\mathcal{L}_k\Phi - \Phi'\mathcal{L}_k G(\psi_u + G'\mathcal{L}_k G)^{-1} G'\mathcal{L}_k\Phi + \psi_x \tag{2.5-14}$$

with $\mathcal{L}_k$ as in (3).

Problem 2.5-3 If $R_{k+1}$ and $\mathcal{L}_k$ are as in Problem 2, show that

$$\mathcal{L}_k - R_{k+1} \geq 0 \tag{2.5-15}$$

Problem 2.5-4 Assume that $\psi_x = H'\psi_y H$, $\psi_y' = \psi_y > 0$ and $(\Phi, G, H)$ is stabilizable
and detectable. Use (14) and (15) to prove that (5) is a stabilizing feedback-gain matrix, viz.
$\Phi_{k+1} = \Phi + GF_{k+1}$ is a stability matrix. [Hint: Refer to Problem 4-12. From (14) form a fake
ARE $\mathcal{L}_k = \Phi'\mathcal{L}_k\Phi - \Phi'\mathcal{L}_k G(\psi_u + G'\mathcal{L}_k G)^{-1} G'\mathcal{L}_k\Phi + Q_k$. Etc.]
2.6 Cheap Control



The performance index used in the LQR problem has to be regarded as a compromise
between two conflicting objectives: to obtain a good regulation performance,
viz. small $\|y(k)\|$, as well as to prevent $\|u(k)\|$ from becoming too large. This
compromise is achieved by selecting suitable values for the weights $\psi_x$, $\psi_u$ and $M$ in
the performance index. It is however interesting to consider in the time-invariant
case a performance index in which

$$\psi_u = O_{m \times m} \qquad M = O_{m \times n} \qquad \psi_x(N) = O_{n \times n}$$

This means that the plant input is allowed to take on even very large values, the
control effort not being penalized in the resulting performance index

$$J(x, u_{[0,N)}) = \sum_{k=0}^{N-1} \|x(k)\|^2_{\psi_x} = \sum_{k=0}^{N-1} \|y(k)\|^2_{\psi_y} \tag{2.6-1}$$

This choice should hopefully yield a high regulation performance though at the
expense of possibly large inputs. The LQR problem with performance index (1)
will be referred to as the Cheap Control problem and, whenever it exists, the
corresponding optimal input for $N \to \infty$ as Cheap Control.

It is to be noticed that, since in (1) $\psi_u = O_{m \times m}$, we have to check that for
solving the Cheap Control problem one can still use (4-9)-(4-15) where, on the
contrary, it was assumed that $\psi_u > 0$. As can be seen from the proof of Theorem
3-1, for any finite $N$ (4-9)-(4-15) hold true even for $\psi_u = O_{m \times m}$, provided that
$G'P(j)G$ is nonsingular. However, Cheap Control is not comprised in the
asymptotic theory of Sect. 2.4 which is crucially based on the assumption that $\psi_u > 0$. In
particular, no stability property can be ensured by Theorem 4-4 for a Cheap Control
regulated plant. Indeed, as we shall now show, the regulation law that for $N \to \infty$
minimizes (1) does not yield in general an asymptotically stable regulated system.

In order to hold this issue in focus, we avoid needless complications by restricting
ourselves to SISO plants, viz. $m = p = 1$. Thus, w.l.o.g. we can set $\psi_y = 1$:

$$J(x, u_{[0,N)}) = \sum_{k=0}^{N-1} y^2(k) = \sum_{k=0}^{N-2} y^2(k) + \|x(N-1)\|^2_{H'H} \tag{2.6-2}$$

We also assume that the first sample of the impulse response of the plant (4-44) is
nonzero

$$w_1 := HG \neq 0 \tag{2.6-3}$$

We shall refer to this condition by saying that the plant has unit I/O delay. Then,
we can solve the Riccati difference equation (4-11), initialized from
$P(0) = \psi_x(N) = O_{n \times n}$, or, according to the second of (2.6-2), from $P(1) = \psi_x(N-1) = H'H$, to
find $P(2) = H'H$, since for a SISO plant $\Phi'H'HG\,(G'H'HG)^{-1}G'H'H\Phi = \Phi'H'H\Phi$.
Then, it follows that for every $j = 1, 2, \cdots$

$$P(j) = H'H \tag{2.6-4}$$
$$F(j) = -(HG)^{-1}H\Phi \tag{2.6-5}$$

Correspondingly,

$$V_j(x) := \min_{u_{[0,j)}} \sum_{k=0}^{j-1} y^2(k) = x'H'Hx = y^2(0) \tag{2.6-6}$$



This result shows that, whenever $w_1 \neq 0$, the constant feedback row-vector (5) is
such that the corresponding time-invariant Cheap Control regulator

$$u(k) = -\frac{H\Phi}{w_1}\,x(k), \quad k = 0, 1, \cdots \tag{2.6-7}$$

takes the plant output to zero at time $k = 1$ and holds it at zero thereafter. In
fact, by (7)

$$x(k+1) = \Phi x(k) + Gu(k) = \left(\Phi - \frac{GH\Phi}{w_1}\right) x(k)$$

Hence,

$$y(k+1) = Hx(k+1) = H\Phi x(k) - \frac{HG}{w_1}\,H\Phi x(k) = H\Phi x(k) - H\Phi x(k) = 0$$
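
The output-deadbeat behaviour of the Cheap Control law $u(k) = -(H\Phi/w_1)\,x(k)$ can be checked numerically. The plant below is a hypothetical second-order example with unit I/O delay, chosen only for illustration; its numbers do not come from the text.

```python
import numpy as np

# Hypothetical SISO plant in companion form with H = [b_2, b_1] (assumed values)
Phi = np.array([[0.0, 1.0], [-0.5, 1.2]])
G = np.array([[0.0], [1.0]])
H = np.array([[0.4, 1.0]])

w1 = (H @ G).item()        # first impulse-response sample, must be nonzero
F = -(H @ Phi) / w1        # Cheap Control gain: F = -(HG)^{-1} H Phi

x = np.array([[1.0], [-2.0]])   # arbitrary initial state
ys = []
for k in range(6):
    ys.append((H @ x).item())   # record y(k) = H x(k)
    u = F @ x                   # u(k) = F x(k)
    x = Phi @ x + G @ u         # x(k+1) = Phi x(k) + G u(k)

closed_loop_eigs = np.linalg.eigvals(Phi + G @ F)
```

In this sketch `ys[0]` is generally nonzero while all later samples vanish up to round-off, and `closed_loop_eigs` contains the eigenvalue at the origin together with the root of the numerator polynomial (here $-b_2/b_1$).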

In order to find out conditions under which the Cheap Control regulated system
is asymptotically stable, the plant is assumed to be stabilizable. Hence, since its
unreachable modes are stable and are left unmodified by the input, w.l.o.g. we
can restrict ourselves to a completely reachable plant in a canonical reachability
representation:

$$\Phi = \begin{bmatrix} 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & & & 1 \\ -a_n & -a_{n-1} & \cdots & -a_1 \end{bmatrix} \qquad G = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \qquad H = \begin{bmatrix} b_n & \cdots & b_1 \end{bmatrix} \tag{2.6-8}$$

It is known that the transfer function $H_{yu}(z)$ from the input $u$ to the output $y$ of
the system (8) equals

$$H_{yu}(z) := H(zI_n - \Phi)^{-1}G = \frac{B(z)}{A(z)} \tag{2.6-9}$$

where

$$A(z) := \det(zI_n - \Phi) = z^n + a_1 z^{n-1} + \cdots + a_n \tag{2.6-10}$$
$$B(z) := b_1 z^{n-1} + \cdots + b_n \tag{2.6-11}$$

Further,

$$w_1 := HG = b_1 \neq 0$$
Problem 2.6-1 Show that the closed-loop state-transition matrix $\Phi_{cl}$ of the unit I/O delay
plant (8) regulated by the Cheap Control (7) equals

$$\Phi_{cl} := \Phi - \frac{GH\Phi}{w_1} = \begin{bmatrix} 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & & & 1 \\ 0 & -b_n/b_1 & \cdots & -b_2/b_1 \end{bmatrix} \tag{2.6-12}$$

Since $\Phi_{cl}$ is in companion form, its characteristic polynomial can be obtained by
inspection

$$\chi_{cl}(z) := \det(zI_n - \Phi_{cl}) = z^n + \frac{b_2}{b_1} z^{n-1} + \cdots + \frac{b_n}{b_1} z = \frac{1}{b_1}\left(b_1 z^n + b_2 z^{n-1} + \cdots + b_n z\right) \tag{2.6-13}$$

We say that $H_{yu}(z)$ or $(\Phi, G, H)$ is a minimum-phase transfer function, or plant,
if the numerator polynomial $B(z)$ in (9) is strictly Hurwitz, viz. it has no root in
the complement of the open unit disc of the complex plane. Further, the control
law $u(k) = Fx(k)$ is said to stabilize the plant $(\Phi, G, H)$ if the closed-loop state
transition matrix $\Phi_{cl} := \Phi + GF$ is a stability matrix, i.e. $\chi_{cl}(z)$ is strictly Hurwitz.
Since the polynomial in (13) is strictly Hurwitz if and only if $B(z)$ is such, we arrive
at the following conclusion which is a generalization (Cf. Problem 2) of the above
analysis.

Theorem 2.6-1. Let the plant $(\Phi, G, H)$ be time-invariant, SISO and stabilizable.
Then, the state-feedback regulator solving the Cheap Control problem yields an
asymptotically stable closed-loop system if and only if the plant is minimum-phase.
If the plant has I/O delay $\tau$, $1 \leq \tau \leq n$,

$$b_1 = b_2 = \cdots = b_{\tau-1} = 0$$
$$w_\tau := H\Phi^{\tau-1}G = b_\tau \neq 0$$

the Cheap Control law is given by

$$u(k) = -\frac{H\Phi^{\tau}}{b_\tau}\,x(k), \quad k = 0, 1, \cdots \tag{2.6-14}$$

Further, provided that the plant is completely reachable, (14) yields a closed-loop
characteristic polynomial given by

$$\chi_{cl}(z) = \frac{1}{b_\tau}\,z^{\tau} B(z) \tag{2.6-15}$$

Finally the Cheap Control law (14) is output-deadbeat in that, for any initial plant
state $x(0) = x \in \mathbb{R}^n$,

$$y(k) = 0, \quad k = \tau, \tau+1, \cdots$$

Correspondingly, for every $j \geq \tau$,

$$V_j(x) = \min_{u_{[0,j)}} \sum_{k=0}^{j-1} y^2(k) = \sum_{k=0}^{\tau-1} y^2(k) = x'\left[\sum_{k=0}^{\tau-1} (\Phi')^k H'H\,\Phi^k\right] x$$
Problem 2.6-2 Consider the plant (8) with I/O delay $\tau$, $1 \leq \tau \leq n$. Show that, similarly to
(4),

$$P(j) = \sum_{k=0}^{\tau-1} (\Phi')^k H'H\,\Phi^k = P(\tau), \quad j \geq \tau$$

Next, using (4-10), find (14). Finally, verify (15).

Naively, one might think that the Cheap Control regulator is obtained by letting
$\psi_u \downarrow O_{m \times m}$ in the regulator solving the steady-state LQOR problem. Indeed
Problem 5 shows that this is the case for minimum-phase SISO plants. However, for
nonminimum-phase SISO plants this is generally not the case (Cf. Problem 6). In
fact, in contrast with Cheap Control, the solution of the steady-state LQOR
problem, provided that the plant is stabilizable and detectable, yields an asymptotically
stable closed-loop system even for vanishingly small $\psi_u > 0$ [KS72] (Cf. Problem
6).

Main points of the section Cheap Control regulation is obtained by setting
$\psi_u = O_{m \times m}$ in the performance index of the LQOR problem. For SISO time-invariant
stabilizable plants, Cheap Control regulation is achieved by a time-invariant
state-feedback control law which is well-defined for each regulation horizon greater than
or equal to the I/O delay of the plant. In this case, the Cheap Control law can be
computed in a simple way (Cf. (14)). In particular, in contrast with the $\psi_u > 0$
case, no Riccati-like equation has to be solved. However, applicability of Cheap
Control is severely limited by the fact that it yields an unstable closed-loop system
whenever the plant is nonminimum-phase.

Problem 2.6-3 Consider the polynomial $B(z)$ in (11) with $b_1 \neq 0$. Show that $|b_n/b_1| > 1$
implies that $B(z)$ is not a strictly Hurwitz polynomial. [Hint: If $r_i$, $i = 1, \cdots, n-1$, denote the
roots of $B(z)$, $B(z)/b_1 = \prod_{i=1}^{n-1}(z - r_i)$.]

Problem 2.6-4 Show that $L := P(\tau)$ of Problem 2 satisfies the following matrix equation

$$L = \Phi' L \Phi + H'H - (\Phi')^{\tau} H'H\,\Phi^{\tau}$$

Show that, provided that $\Phi$ is a stability matrix, the equation above becomes as $\tau \to \infty$ the
Lyapunov equation (4-26) with $\psi_x = H'H$.

Problem 2.6-5 Consider the following SISO completely reachable and observable
minimum-phase plant

$$\Phi = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \qquad G = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \qquad H = \begin{bmatrix} 1 & 2 \end{bmatrix}$$

and the related steady-state LQOR problem with performance index

$$J(x, u_{[0,\infty)}) = \sum_{k=0}^{\infty} \left[y^2(k) + \rho u^2(k)\right]$$

with $\rho > 0$. Show that:

i. The corresponding ARE (4-37) has nonnegative definite solution

$$P(\rho) = \begin{bmatrix} 1 & 2 \\ 2 & p(\rho) \end{bmatrix} \qquad p(\rho) := \frac{5-\rho}{2} + \frac{1}{2}\left(\rho^2 + 10\rho + 9\right)^{1/2}$$

ii. The steady-state LQOR row-vector feedback (4-40) equals

$$F(\rho) = -(\rho + p(\rho))^{-1}\begin{bmatrix} 0 & 2 \end{bmatrix}$$

iii. The Cheap Control row-vector feedback (5) equals

$$F = \begin{bmatrix} 0 & -\tfrac{1}{2} \end{bmatrix}$$

and the corresponding strictly Hurwitz closed-loop characteristic polynomial is

$$\chi_{cl}(z) = z^2 + \tfrac{1}{2} z$$

iv. $F(\rho) \to F$, as $\rho \downarrow 0$.

Problem 2.6-6 Consider the SISO plant $(\Phi, G, H)$ with $\Phi$ and $G$ as in Problem 5 and

$$H = \begin{bmatrix} 2 & 1 \end{bmatrix}$$

Show that:

i. The plant is nonminimum-phase

ii. $P(\rho)$ and $F(\rho)$ associated to the same performance index as in Problem 5 are given again
as in i. and ii. of Problem 5.

iii. The Cheap Control row-vector feedback (5) equals

$$F = \begin{bmatrix} 0 & -2 \end{bmatrix}$$

and yields the non Hurwitz closed-loop characteristic polynomial

$$\chi_{cl}(z) = z^2 + 2z$$

iv. $$\hat{F} := \lim_{\rho \downarrow 0} F(\rho) = \begin{bmatrix} 0 & -\tfrac{1}{2} \end{bmatrix} \neq F$$

yields a strictly Hurwitz closed-loop characteristic polynomial whose roots are given by
the stable roots and the reciprocal of the unstable roots of $\chi_{cl}(z)$ in iii.
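
The contrast between Problems 5 and 6 can be explored numerically with the sketch below. It assumes the plant matrices $\Phi = [0\ 1;\ 0\ 0]$, $G = [0\ 1]'$, with $H = [1\ 2]$ (minimum-phase) and $H = [2\ 1]$ (nonminimum-phase); these values are inferred from the results stated in the two problems and should be treated as assumptions. The helper names `lqor_gain` and `cheap_gain` are hypothetical.

```python
import numpy as np

# Assumed plant matrices (inferred from the stated results, not quoted verbatim)
Phi = np.array([[0.0, 1.0], [0.0, 0.0]])
G = np.array([[0.0], [1.0]])

def lqor_gain(H, rho, iters=500):
    """Steady-state P and F by Riccati iterations with psi_x = H'H, psi_u = rho."""
    P = np.zeros((2, 2))
    for _ in range(iters):
        F = -(G.T @ P @ Phi) / (rho + (G.T @ P @ G).item())
        Phi_cl = Phi + G @ F
        P = Phi_cl.T @ P @ Phi_cl + rho * F.T @ F + H.T @ H
    F = -(G.T @ P @ Phi) / (rho + (G.T @ P @ G).item())
    return P, F

def cheap_gain(H):
    """Cheap Control row vector F = -(HG)^{-1} H Phi for unit I/O delay."""
    return -(H @ Phi) / (H @ G).item()

rho = 1e-6
H_mp = np.array([[1.0, 2.0]])    # B(z) = 2z + 1, root at -1/2
H_nmp = np.array([[2.0, 1.0]])   # B(z) = z + 2, root at -2

P_mp, F_mp = lqor_gain(H_mp, rho)
P_nmp, F_nmp = lqor_gain(H_nmp, rho)
F_cheap_mp = cheap_gain(H_mp)
F_cheap_nmp = cheap_gain(H_nmp)
```

Under these assumptions, the small-$\rho$ LQOR gain approaches the Cheap Control gain only in the minimum-phase case; in the nonminimum-phase case the two differ and only the LQOR gain is stabilizing.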



2.7 Single Step Regulation 

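Assume that the time-invariant plant (4-44) has I/O delay $\tau$, $1 \leq \tau \leq n$, viz. its
impulse response sample matrices $W_k := H\Phi^{k-1}G$ are such that

$$W_k = O_{p \times m}, \quad k = 1, 2, \cdots, \tau-1 \tag{2.7-1}$$

and

$$W_\tau \neq O_{p \times m} \tag{2.7-2}$$

Then, consider the performance index with $\psi_u > 0$

$$J(x(0), u_{[0,\tau)}) = \sum_{k=0}^{\tau-1}\left[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] + \|y(\tau)\|^2_{\psi_y} \tag{2.7-3}$$

Because of the I/O delay $\tau$, $y_{[0,\tau)}$ is not affected by $u_{[0,\tau)}$. In fact,

$$y(k) = \begin{cases} H\Phi^k x(0), & k = 0, \cdots, \tau-1 \\ H\Phi^{\tau} x(0) + W_\tau u(0), & k = \tau \end{cases} \tag{2.7-4}$$

Therefore the optimal $u^{\circ}_{[0,\tau)}$ minimizing (3) is given by

$$u^{\circ}(k) = \begin{cases} F x(0), & k = 0 \\ O_m, & k = 1, \cdots, \tau-1 \end{cases} \tag{2.7-5}$$

$$F = -\left[\psi_u + W_\tau'\psi_y W_\tau\right]^{-1} W_\tau'\psi_y H\Phi^{\tau} \tag{2.7-6}$$

Problem 2.7-1 Verify (5)-(6) by using (4-9)-(4-11).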
It is to be pointed out that, relatively to the determination of the only nonzero
input $u^{\circ}(0)$ in the optimal sequence $u^{\circ}_{[0,\tau)}$, (3) can be replaced by the following
performance index comprising a single regulation step

$$J(x(0), u(0)) := \|y(\tau)\|^2_{\psi_y} + \|u(0)\|^2_{\psi_u} \tag{2.7-7}$$

Problem 2.7-2 Verify (6) by direct minimization of (7) w.r.t. $u(0)$.

It follows that the time-invariant state-feedback control law

$$u(k) = Fx(k) \tag{2.7-8}$$

with $F$ as in (6), minimizes for each $k \in \mathbb{Z}$ the single step performance index

$$J(x(k), u(k)) := \|y(k+\tau)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u} \tag{2.7-9}$$

for the plant (4-44) with I/O delay $\tau$. This will be referred to as Single Step
Regulation.

As in Cheap Control, the feedback-gain of the Single Step Regulator can be
computed without solving any Riccati-like equation. Similarly to Cheap Control,
this has negative consequences for the stability of the closed-loop system. In order
to find out the intrinsic limitations of Single Step regulated systems, it suffices to
consider a SISO plant. We assume also that the plant is stabilizable. Using the
same argument as for Cheap Control, we can restrict ourselves to a completely
reachable plant $(\Phi, G, H)$ with I/O delay $\tau$, $\Phi$ and $G$ as in (6-8), and

$$H = \begin{bmatrix} b_n & \cdots & b_\tau & 0 & \cdots & 0 \end{bmatrix} \tag{2.7-10}$$

being

$$w_\tau = b_\tau \neq 0$$

In this case, for $\psi_y = 1$ and $\psi_u = \rho > 0$, (6) becomes

$$F = -\frac{b_\tau}{\rho + b_\tau^2}\,H\Phi^{\tau} = -\frac{b_\tau}{\rho + b_\tau^2}\begin{bmatrix} 0 & \cdots & 0 & b_n & \cdots & b_\tau \end{bmatrix}\Phi \tag{2.7-11}$$

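Problem 2.7-3 Verify that, for $H$ as in (10) and $\Phi$ as in (6-8),

$$H\Phi^{\tau-1} = \begin{bmatrix} 0 & \cdots & 0 & b_n & \cdots & b_\tau \end{bmatrix}.$$

Problem 2.7-4 Show that the closed-loop state-transition matrix $\Phi_{cl}$ of the plant $(\Phi, G, H)$
with I/O delay $\tau$, $\Phi$ and $G$ as in (6-8), and $H$ as in (10), is again in companion form with its last
row as follows

$$(1 - \gamma b_\tau)\begin{bmatrix} -a_n & \cdots & -a_{n-\tau+1} & -a_{n-\tau} & \cdots & -a_1 \end{bmatrix} + \gamma\begin{bmatrix} 0 & \cdots & 0 & -b_n & \cdots & -b_{\tau+1} \end{bmatrix} \tag{2.7-12}$$

with $\gamma := b_\tau/(\rho + b_\tau^2)$.

Since $\Phi_{cl}$ is in companion form, its characteristic polynomial can be obtained by
inspection

$$\chi_{cl}(z) = \gamma\left[\frac{\rho}{b_\tau}\,A(z) + z^{\tau} B(z)\right] \tag{2.7-13}$$

with $A(z)$ as in (6-10) and

$$B(z) := b_\tau z^{n-\tau} + \cdots + b_n \tag{2.7-14}$$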
Theorem 2.7-1. Let the plant $(\Phi, G, H)$ be time-invariant, SISO, stabilizable and
with I/O delay $\tau$. Then, the Single Step Regulator $u(k) = Fx(k)$ with feedback-gain
(11) yields a closed-loop system with characteristic polynomial (13).

Stability of a SISO Single Step regulated plant can be investigated by the
root-locus method. The eigenvalues of the closed-loop system are close to the roots of
$z^{\tau}B(z)$ if $\rho$ is small, and close to the roots of $A(z)$ if $\rho$ is large. If the plant is
minimum-phase, there is an upper bound on $\rho$, call it $\rho_M$, such that for $\rho < \rho_M$
the closed-loop system is stable. If the plant is open-loop stable, there is a lower
bound on $\rho$, call it $\rho_m$, such that for $\rho > \rho_m$ the closed-loop system is stable. If
the plant is either nonminimum-phase or open-loop unstable there are, however,
critical control weights which yield an unstable closed-loop system. If the plant
is both nonminimum-phase and open-loop unstable, there may be no value of $\rho$
which makes the Single Step regulated plant asymptotically stable.
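
The dependence of the closed-loop eigenvalues on the input weight $\rho$ can be illustrated with a small numerical sketch. The plant below (open-loop stable, nonminimum-phase, unit I/O delay) and the helper names are assumptions chosen for illustration; the gain is the SISO Single Step gain $F = -\,b_\tau H\Phi^{\tau}/(\rho + b_\tau^2)$ and the polynomial is $(\rho/b_\tau)A(z) + z^{\tau}B(z)$.

```python
import numpy as np

# Hypothetical plant: A(z) = z^2 - 0.5 z + 0.06 (stable), B(z) = z - 2 (nonminimum-phase)
a = np.array([1.0, -0.5, 0.06])   # coefficients of A(z), highest power first
b = np.array([1.0, -2.0])         # coefficients of B(z): b_tau, ..., b_n
tau = 1                           # I/O delay

Phi = np.array([[0.0, 1.0], [-a[2], -a[1]]])   # companion form
G = np.array([[0.0], [1.0]])
H = np.array([[b[1], b[0]]])                   # H = [b_2, b_1]

def single_step_gain(rho):
    """SISO Single Step gain with psi_y = 1 and psi_u = rho."""
    w_tau = (H @ np.linalg.matrix_power(Phi, tau - 1) @ G).item()
    return -(w_tau / (rho + w_tau**2)) * (H @ np.linalg.matrix_power(Phi, tau))

def closed_loop_eigs(rho):
    return np.linalg.eigvals(Phi + G @ single_step_gain(rho))

def poly_roots(rho):
    """Roots of (rho / b_tau) A(z) + z^tau B(z)."""
    z_tau_B = np.concatenate([b, np.zeros(tau)])
    return np.roots(rho / b[0] * a + z_tau_B)

radius_small_rho = np.max(np.abs(closed_loop_eigs(0.01)))
radius_large_rho = np.max(np.abs(closed_loop_eigs(100.0)))
```

For this example the spectral radius exceeds one for small $\rho$ (eigenvalues near the roots of $z^{\tau}B(z)$) and drops below one for large $\rho$ (eigenvalues near the roots of $A(z)$).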

Main points of the section The Single Step Regulation law for a SISO plant can
be easily computed (Cf. (11)). The price paid for it is that applicability of Single
Step Regulation is limited by the fact that it yields an asymptotically stable
closed-loop system only under restrictive assumptions on the plant and the input weight
$\rho$. In particular, there may be no Single Step Regulator capable of stabilizing a
nonminimum-phase open-loop unstable plant.

Problem 2.7-5 Consider the open-loop unstable nonminimum-phase plant 



Compute the Single Step feedback-gain row-vector (11). Show that the closed-loop eigenvalues
of the Single Step Regulated systems equal $1 \pm \sqrt{1+\rho}$. Conclude that there is no $\rho$, $\rho \in [0, \infty)$,
yielding a stable closed-loop system.



Notes and References 

LQR is a topic widely and thoroughly discussed in standard textbooks: 
[AF66], [AM71], [AM90], [BH75], [DV85], [KS72], [Lew86]. 

Dynamic Programming was introduced by Bellman [Bel57] . More recent texts 
include [Ber76] and [Whi81]. 

The role of the Riccati equation in the LQR problem was emphasized by Kalman 
[Kal60a]. See also [Wil71]. Strong solutions of the ARE are discussed in [CGS84] 
and [dSGG86]. The literature on the Riccati equation is now immense, e.g. [Ath71] 
and [Bit89]. 

Numerical factorization techniques for solving LQ problems were addressed by 
[Bie77] and [LH74]. See also [KHB+85] and [Pet86]. Kleinman iterations for solving 
the ARE were analysed in [Kle68], [McC69], and for the discrete-time case in 
[Hew71]. The control-theoretic interpretation of Kleinman iterations depicted in
Fig. 5-1 appeared in [MM80]. 

Cheap Control is the deterministic state-space version of the Minimum-Variance
Regulation of stochastic control [Ast70]. Similarly, in a deterministic state-space 
framework Single Step Regulation corresponds to the Generalized Minimum Variance 
Regulation discussed in [CG75], [CG79], and [WR79]. 



CHAPTER 3 



I/O DESCRIPTIONS AND 
FEEDBACK SYSTEMS 



This chapter introduces representations for signals and linear time-invariant dy- 
namic systems that are alternative to the state-space ones. These representations 
basically consist of system transfer matrices and matrix fraction descriptions. They 
will be generically referred to as I/O or "external" descriptions in contrast with the 
state-space or "internal" descriptions. Experience has led us to appreciate
the advantage of being able to use both kinds of descriptions in a cooperative
fashion, and exploit their relative merits.

The chapter is organized as follows. In Sect. 1 we introduce the d-representation
of a sequence and matrix-fraction descriptions of system transfer matrices. Sect. 2 
shows how stability of a feedback linear system can be studied by using matrix- 
fraction descriptions of the plant and the compensator. These tools allow us to 
characterize all feedback compensators, in a suitably parameterized form, which 
stabilize a given plant. The issue of robust stability is addressed in Sect. 3. After 
introducing in Sect. 4 system polynomial representations in the unit backward 
operator, Sect. 5 discusses how the asymptotic tracking problem can be formulated 
as a stability problem of a feedback system. 

3.1 Sequences and Matrix Fraction Descriptions 

Consider a time-indexed matrix-valued sequence

$$u(\cdot) = \{u(k)\}_{k=-\infty}^{\infty} = \{\cdots, u(-1);\ u(0), u(1), \cdots\} \tag{3.1-1}$$

where: $u(k) \in \mathbb{R}^{p \times m}$; $k \in \mathbb{Z}$; and the semicolon separates the samples at negative
times on the left from the ones at nonnegative times on the right. Another
possibility is to write

$$u(d) := \sum_{k=-\infty}^{\infty} u(k)\,d^k \tag{3.1-2}$$

This has to be interpreted as a representation of the given sequence where the
symbol $d^k$, the $k$th power of $d$, indicates that the associated matrix $u(k)$ is the
value taken on by $u(\cdot)$ at the integer $k$ along the time-axis. E.g., for the
real-valued sequence

$$u(\cdot) = \{-1;\ 1, 2, -3\} \tag{3.1-3}$$

we have

$$u(d) = -d^{-1} + 1 + 2d - 3d^2 \tag{3.1-4}$$

In $u(d)$ the powers of $d$ are instrumental for identifying the positions of the numbers
$-1, 1, 2, -3$ along the time-axis. From this viewpoint, $d$ is an indeterminate and no
numerical value, either real or complex, pertains to it. It is only a time-marker. In
particular, the power series (2) is a formal series in that it is not to be interpreted
as a function of $d$, and there is no question of convergence whatsoever. We shall
refer to $u(d)$ as the d-representation of $u(\cdot)$.
Consider now

$$u_{-1}(\cdot) := u(\cdot - 1) \tag{3.1-5}$$

a copy of the sequence $u(\cdot)$ delayed by one step. We have

$$u_{-1}(d) = \sum_{k=-\infty}^{\infty} u(k-1)\,d^k = d\,u(d) \tag{3.1-6}$$

We see then that $d$ applied to $u(d)$ yields the d-representation of the sequence $u(\cdot)$
delayed by one step.

Consider next the sequence $v(\cdot)$ obtained by convolving $w(\cdot)$ with $u(\cdot)$

$$v(k) = \sum_{i=-\infty}^{\infty} w(i)\,u(k-i) = \sum_{r=-\infty}^{\infty} w(k-r)\,u(r) \tag{3.1-7}$$

we have

$$\begin{aligned}
v(d) &= \sum_{k=-\infty}^{\infty}\sum_{i=-\infty}^{\infty} w(i)\,u(k-i)\,d^k \\
&= \sum_{i=-\infty}^{\infty} w(i) \sum_{k=-\infty}^{\infty} u(k-i)\,d^k \\
&= \sum_{i=-\infty}^{\infty} w(i)\,d^i\,u(d) \qquad [(6)] \\
&= w(d)\,u(d)
\end{aligned} \tag{3.1-8}$$

We then see that the d-representation of the convolution of two sequences $w(\cdot)$ and
$u(\cdot)$ is the product of the d-representations of $w(\cdot)$ and $u(\cdot)$.

We insist again on pointing out that the operations under (8) are formal. In
particular, the two infinite summations in (8) always commute as long as (7) makes
sense.
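
For finite-length scalar sequences, the correspondence between convolution and the product of d-representations can be sketched as follows. The dictionary encoding `{power: coefficient}`, the helper names and the sequence `w` are assumptions made for this illustration; `u` is the example sequence $\{-1;\ 1, 2, -3\}$.

```python
from collections import defaultdict

def d_product(w, u):
    """Product of two d-representations stored as {power of d: coefficient}."""
    v = defaultdict(float)
    for i, wi in w.items():
        for r, ur in u.items():
            v[i + r] += wi * ur   # d^i * d^r = d^(i+r)
    return dict(v)

def convolve_at(w, u, k):
    """v(k) = sum_i w(i) u(k - i), the time-domain convolution."""
    return sum(wi * u.get(k - i, 0.0) for i, wi in w.items())

u = {-1: -1.0, 0: 1.0, 1: 2.0, 2: -3.0}   # u(d) = -d^{-1} + 1 + 2d - 3d^2
w = {0: 1.0, 1: 0.5}                      # hypothetical causal sequence

v = d_product(w, u)                # d-representation of the convolution
delayed = d_product({1: 1.0}, u)   # multiplying by d delays u by one step
```

Here `v[k]` agrees with `convolve_at(w, u, k)` for every `k`, and `delayed` is the same sequence as `u` with every time index increased by one.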

Given $u(d)$ we define as its adjoint

$$u^*(d) := u'(d^{-1}) \tag{3.1-9}$$

Then, $u^*(d)$ is the d-representation of a sequence obtained by taking the transpose
of each sample of $u(\cdot)$ and reversing the time-axis. We say that $u(d)$ has order $\ell$
whenever $\ell$ is the minimum $d$-power present in (2). In such a case, we shall write

$$\mathrm{ord}\,u(d) = \ell \tag{3.1-10}$$

E.g., the order of $u(d)$ in (4) equals $-1$. Since $u(\cdot)$ and $u(d)$ identify the same
entity, in the sequel, for the sake of conciseness, we shall simply refer to $u(d)$ as a
sequence, whenever no ambiguity can arise.

We say that $u(d)$ is a causal sequence if $\mathrm{ord}\,u(d) \geq 0$, strictly causal if $\mathrm{ord}\,u(d) > 0$.
$u(d)$ is called anticausal (strictly anticausal) if $u^*(d)$ is causal (strictly causal).
A sequence is called one-sided if it is either causal or anticausal. Otherwise, it is
called two-sided.

We write $u(\cdot) \in \ell_2$, or equivalently $u(d) \in \ell_2$, whenever the corresponding
sequence has finite energy, viz.

$$\|u(\cdot)\|^2 := \mathrm{Tr} \sum_{k=-\infty}^{\infty} u'(k)u(k) = \mathrm{Tr} \sum_{k=-\infty}^{\infty} u(k)u'(k) < \infty \tag{3.1-11}$$

The above quantity can be also computed as follows

$$\|u(d)\|^2 := \mathrm{Tr}\,\langle u^*(d)u(d)\rangle \tag{3.1-12}$$

where the symbol $\langle\,\cdot\,\rangle$ denotes extraction of the 0-power term. E.g.,
$\langle -d^{-1} + 2 + d - 3d^2\rangle = 2$. It is easy to verify that

$$\|u(\cdot)\|^2 = \|u(d)\|^2 \tag{3.1-13}$$

whenever $u(\cdot) \in \ell_2$.
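
The equality between the time-domain energy and the 0-power term of $u^*(d)u(d)$ can be checked on a finite scalar sequence. The sketch below uses a hypothetical dictionary encoding `{power: coefficient}` and illustrative helper names; for scalar samples the transpose in the adjoint is trivial.

```python
from collections import defaultdict

def d_product(a, b):
    """Product of two d-representations stored as {power of d: coefficient}."""
    out = defaultdict(float)
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] += ai * bj
    return dict(out)

def adjoint(u):
    """u*(d) = u'(d^{-1}): reverse the time axis (scalar samples assumed)."""
    return {-k: uk for k, uk in u.items()}

u = {-1: -1.0, 0: 1.0, 1: 2.0, 2: -3.0}   # the example sequence {-1; 1, 2, -3}

energy_time = sum(uk**2 for uk in u.values())          # sum of squared samples
energy_formal = d_product(adjoint(u), u).get(0, 0.0)   # 0-power term of u*(d) u(d)
```

Both quantities evaluate to $1 + 1 + 4 + 9 = 15$ for this sequence.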

Consider now temporarily the series (1) as a numerical series, viz. as a function
of $d \in \mathbb{C}$, $\mathbb{C}$ denoting the field of complex numbers. Assume that the series converges
for $d$ in some subset $\mathcal{D}$ of $\mathbb{C}$ and its sum can be written in a closed form $S(d)$, viz.

$$S(d) = \sum_{k=-\infty}^{\infty} u(k)\,d^k, \quad d \in \mathcal{D} \subset \mathbb{C} \tag{3.1-14}$$

In such a case we shall equate the formal series (2) to $S(d)$

$$u(d) = S(d) \tag{3.1-15}$$

and $S(d)$ in (15) will be called the formal sum of (1).

Example 3.1-1 For any square matrix $\Phi \in \mathbb{R}^{n \times n}$, consider the causal sequence

$$u(\cdot) = \{u(k) = \Phi^k\}_{k=0}^{\infty} = \{;\ I_n, \Phi, \Phi^2, \cdots\} \tag{3.1-16}$$

Then,

$$u(d) = \sum_{k=0}^{\infty} (\Phi d)^k = (I_n - d\Phi)^{-1} = S(d) \tag{3.1-17}$$

In fact, $(I_n - d\Phi)^{-1}$ is the sum of the numerical series $\sum_{k=0}^{\infty} (\Phi d)^k$ for every complex number $d$
such that $|d| < 1/|\lambda_{\max}(\Phi)|$, where $\lambda_{\max}(\Phi)$ is the eigenvalue of $\Phi$ with maximum modulus.
In accordance with interpreting $d$ as an indeterminate, we point out that (15) is
formal and there is no question of convergence of $u(d)$ to $S(d)$ whatsoever.

Another point that has to be brought out is the following. Given a two-sided 
sequence, its formal sum, whenever it exists, is well defined. Conversely, given a 
formal sum S(d), in general the corresponding series u(d) can be unambiguously 
identified only by specifying its order, e.g. whether u(d) is either causal or anti- 
causal. 

Example 3.1-2 Consider the formal sum $S(d) = (I_n - d\Phi)^{-1}$ found in Example 1 for the
sequence (16) with $\mathrm{ord}\,u(d) = 0$. Assuming $\Phi$ nonsingular, we can write formally

$$\begin{aligned}
S(d) &= (I_n - d\Phi)^{-1} \\
&= -(d\Phi)^{-1}(I_n - d^{-1}\Phi^{-1})^{-1} \\
&= -\{\Phi^{-1}d^{-1} + \Phi^{-2}d^{-2} + \cdots\}
\end{aligned} \tag{3.1-18}$$

Then, $S(d)$ is also the formal sum of the above series, whose order is $-\infty$, and corresponds to the
strictly anticausal sequence

$$v(\cdot) = \{\cdots, -\Phi^{-2}, -\Phi^{-1};\ \} \tag{3.1-19}$$

In fact, $(I_n - d\Phi)^{-1}$ is the sum of the numerical series $-\sum_{k=-\infty}^{-1} (\Phi d)^k$ for every complex number
$d$ such that $|d| > 1/|\lambda_{\min}(\Phi)|$, where $\lambda_{\min}(\Phi)$ is the eigenvalue of $\Phi$ with minimum modulus.

Problem 3.1-1 Convince yourself that there is no formal sum for the d-representations of the
two-sided sequences

$$z(\cdot) := \{\cdots, -\Phi^{-2}, -\Phi^{-1};\ I_n, \Phi, \Phi^2, \cdots\} \tag{3.1-20}$$
$$h(\cdot) := \{\cdots, \Phi^{-2}, \Phi^{-1};\ I_n, \Phi, \Phi^2, \cdots\}. \tag{3.1-21}$$

In this book, it will be made always clear from the context whether a formal 
sum corresponds to either a causal or anticausal matrix-sequence. The following 
example consolidates the point. 

Example 3.1-3 Consider the linear time-invariant state-space representation

$$\begin{cases} x(k+1) = \Phi x(k) + Gu(k) \\ y(k) = Hx(k) \end{cases} \qquad k = 0, 1, 2, \cdots \tag{3.1-22}$$

We have

$$x(k) = \Phi^k x(0) + \sum_{i=-\infty}^{\infty} g(i)\,u(k-i) \tag{3.1-23}$$

where $u(\cdot) = \{;\ u(0), u(1), \cdots\}$ is a causal input sequence and

$$g(i) := \begin{cases} \Phi^{i-1}G, & i \geq 1 \\ O_{n \times m}, & \text{elsewhere} \end{cases} \tag{3.1-24}$$

is the $i$-th sample of the impulse-response matrix of $(\Phi, G, I_n)$. Since by Example 1 and (6)

$$g(d) = (I_n - d\Phi)^{-1} dG \tag{3.1-25}$$

using (8) we find

$$x(d) = (I_n - d\Phi)^{-1}\,[x(0) + dG\,u(d)]. \tag{3.1-26}$$

Here, since $x(d)$ and $u(d)$ are causal sequences and the system (22) dynamic, $(I_n - d\Phi)^{-1}$ must
be interpreted as the formal sum of the causal matrix-sequence of Example 1. Further,

$$y(d) = H(I_n - d\Phi)^{-1}x(0) + H_{yu}(d)\,u(d) \tag{3.1-27}$$

where the rational matrix

$$H_{yu}(d) := H(I_n - d\Phi)^{-1} dG \tag{3.1-28}$$

is the transfer matrix of the system (22). This must be regarded as the formal sum of the
d-representation of the sequence of the samples of the impulse response matrix of the system (22):

$$h_{yu}(\cdot) := \{;\ O_{p \times m}, HG, H\Phi G, \cdots\}. \tag{3.1-29}$$
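
Although the sum is formal, it can be sanity-checked numerically by temporarily treating $d$ as a complex number inside the region of convergence. The sketch below compares the closed form $H(I_n - d\Phi)^{-1}dG$ with a truncated series built from the impulse-response samples $H\Phi^{k-1}G$; the plant matrices and the value of `d` are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical system matrices (not from the text); eigenvalues of Phi: 0.5, -0.25
Phi = np.array([[0.5, 1.0], [0.0, -0.25]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])

d = 0.8 + 0.3j   # numeric value with |d| < 1/|lambda_max(Phi)| = 2

# Closed form of the transfer matrix evaluated at d
closed_form = H @ np.linalg.inv(np.eye(2) - d * Phi) @ (d * G)

# Truncated series: sum over k >= 1 of (H Phi^(k-1) G) d^k
series = sum(
    (H @ np.linalg.matrix_power(Phi, k - 1) @ G) * d**k
    for k in range(1, 200)
)
```

Within the convergence region the two values agree to numerical precision, which is consistent with reading the rational matrix as the formal sum of the causal impulse-response sequence.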
We say that $\Sigma = (\Phi, G, H)$ is free of hidden modes if it is completely reachable and
completely observable. $\Sigma$ is free of nonzero hidden eigenvalues if it is completely
controllable and completely reconstructible. In Appendix B, it is shown that $\Sigma$ is
completely controllable if and only if the polynomial matrices $(I_n - d\Phi)$ and $dG$ are
left coprime, and completely reconstructible if and only if $(I_n - d\Phi)$ and $H$ are right
coprime. $\Sigma$ will be said to be free of unstable hidden modes if it is stabilizable and
detectable. In Appendix B, it is shown that a necessary and sufficient condition for
$\Sigma$ to be stabilizable is that the greatest common left divisors (gcld) $\Delta(d)$ of $I_n - d\Phi$
and $dG$ are strictly Hurwitz, viz.

$$\det \Delta(d) \neq 0, \quad \forall\, |d| \leq 1 \tag{3.1-30}$$

It is also shown that a necessary and sufficient condition for $\Sigma$ to be detectable
is that the greatest common right divisors (gcrd) of $I_n - d\Phi$ and $H$ are strictly
Hurwitz.

$H_{yu}(d)$ in (28) can be represented in terms of matrix-fraction descriptions
(MFDs)

$$H_{yu}(d) = A_1^{-1}(d)B_1(d) \tag{3.1-31}$$
$$H_{yu}(d) = B_2(d)A_2^{-1}(d) \tag{3.1-32}$$

where: $A_1(d)$ and $B_1(d)$ are polynomial matrices of dimensions $p \times p$ and,
respectively, $p \times m$; $A_2(d)$ and $B_2(d)$ polynomial matrices of dimensions $m \times m$ and,
respectively, $p \times m$. $A_1^{-1}(d)B_1(d)$ is called a left MFD. Further, this is said to be an
irreducible left MFD, whenever $A_1(d)$ and $B_1(d)$ are left coprime. Mutatis mutandis
a similar terminology is used for the right MFD $B_2(d)A_2^{-1}(d)$. For an irreducible
left MFD $A_1^{-1}(d)B_1(d)$ to represent the transfer matrix of a strictly causal dynamic
system (22) it is necessary and sufficient that:

i. $A_1(0)$ is nonsingular (3.1-33)

ii. $d \mid B_1(d)$. (3.1-34)

The latter condition is expressed in words by saying that $d$ divides $B_1(d)$, viz.
$B_1(0) = O_{p \times m}$, i.e. $\mathrm{ord}\,B_1(d) > 0$. Then, $B_1(d)$ is a strictly causal matrix-sequence
of finite length. Condition i. is necessary and sufficient for $A_1(d)$ to be causally
invertible, viz. for the existence of a causal matrix sequence $A_1^{-1}(d)$ such that

$$I_p = A_1(d)A_1^{-1}(d) = A_1^{-1}(d)A_1(d) \tag{3.1-35}$$

Problem 3.1-2 Let $A(d)$ be a polynomial matrix

$$A(d) = A_0 + A_1 d + \cdots + A_n d^n$$

with $A_i \in \mathbb{R}^{p \times p}$. Show that there exists a causal matrix-sequence

$$A^{-1}(d) = \sum_{k=0}^{\infty} V_k\,d^k, \quad V_k \in \mathbb{R}^{p \times p}$$

such that

$$I_p = A(d)A^{-1}(d) = A^{-1}(d)A(d)$$

if and only if

$A(0) = A_0$ is nonsingular.
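
When $A_0$ is nonsingular, the coefficients $V_k$ of the causal inverse can be generated recursively by matching powers of $d$ in $A(d)A^{-1}(d) = I_p$, which gives $V_0 = A_0^{-1}$ and $V_k = -A_0^{-1}\sum_{i=1}^{\min(k,n)} A_i V_{k-i}$ for $k \geq 1$. The sketch below implements this recursion; the function name `causal_inverse` and the example matrices are assumptions made for illustration.

```python
import numpy as np

def causal_inverse(A_coeffs, n_terms):
    """First n_terms coefficients V_k of A^{-1}(d) = sum_k V_k d^k.

    A_coeffs = [A_0, A_1, ..., A_n]; requires A_0 = A(0) nonsingular.
    """
    A0_inv = np.linalg.inv(A_coeffs[0])
    n = len(A_coeffs) - 1
    V = [A0_inv]                       # V_0 = A_0^{-1}
    for k in range(1, n_terms):
        acc = np.zeros_like(A0_inv)
        for i in range(1, min(k, n) + 1):
            acc += A_coeffs[i] @ V[k - i]
        V.append(-A0_inv @ acc)        # coefficient of d^k must vanish
    return V

# Hypothetical first-order polynomial matrix A(d) = A0 + A1 d
A0 = np.array([[1.0, 0.0], [0.5, 2.0]])
A1 = np.array([[0.3, -1.0], [0.0, 0.4]])
V = causal_inverse([A0, A1], n_terms=8)
```

Multiplying the truncated series by $A(d)$ on either side and collecting powers of $d$ returns the identity at power zero and zero matrices at the higher powers covered by the truncation.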
It is crucial to appreciate that in general the structural properties of $\Sigma$ cannot be
inferred from the MFDs of $H_{yu}(d)$. In particular, an irreducible MFD of $H_{yu}(d)$
does not provide information on the hidden modes of $\Sigma$. In order to make this
precise, let us define the d-characteristic polynomial $\chi_\Phi(d)$ of $\Phi$

$$\chi_\Phi(d) := \det(I_n - d\Phi) \tag{3.1-36}$$

Then, from Appendix B we have the following result.

Fact 3.1-1. Consider the system $\Sigma = (\Phi, G, H)$. Let the MFDs (31) and (32) of
its transfer matrix be irreducible. Then,

$$\chi_\Phi(d) = \frac{\det A_1(d)}{\det A_1(0)} = \frac{\det A_2(d)}{\det A_2(0)} \tag{3.1-37}$$

if and only if $\Sigma$ is free of nonzero hidden eigenvalues, i.e. controllable and
reconstructible.

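Problem 3.1-3 Consider the characteristic polynomial of $\Phi$

$$\bar{\chi}_\Phi(z) := \det(zI_n - \Phi) \tag{3.1-38}$$

Clearly, if $\partial\bar{\chi}_\Phi(z)$ denotes the degree of $\bar{\chi}_\Phi(z)$, we have $\partial\bar{\chi}_\Phi(z) = \dim\Phi = n$. Show that the
d-characteristic polynomial of $\Phi$ is the reciprocal polynomial of $\bar{\chi}_\Phi$, viz.

$$\chi_\Phi(d) = d^{\partial\bar{\chi}_\Phi}\,\bar{\chi}_\Phi^*(d) = d^n\,\bar{\chi}_\Phi(d^{-1}) \tag{3.1-39}$$

and that

$$\partial\chi_\Phi(d) = \partial\bar{\chi}_\Phi(z) - \text{no. of zero roots of } \bar{\chi}_\Phi(z) \tag{3.1-40}$$

Further,

$$\bar{\chi}_\Phi(d) = d^{\partial\bar{\chi}_\Phi}\,\chi_\Phi(d^{-1}) = d^n\,\chi_\Phi^*(d) \tag{3.1-41}$$

Note that if $p(d)$ is a polynomial, then the roots of its reciprocal polynomial, defined as $d^{\partial p}\,p^*(d)$,
equal the reciprocal of the nonzero roots of $p(d)$.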

Problem 3 shows that a necessary and sufficient condition for $\Sigma$ to be
asymptotically stable is that $\chi_\Phi(d)$ be strictly Hurwitz. Since, according to Fact 1, the
determinants of the denominators of the irreducible MFDs of $H_{yu}(d)$ capture only
the nonzero unhidden eigenvalues of $\Phi$, a condition for the asymptotic stability of
$\Sigma$ can be stated as follows.

Proposition 3.1-1. Let the MFDs (31) and (32) of the transfer matrix of the
system $\Sigma = (\Phi, G, H)$ be irreducible. Let $\Sigma$ be free of unstable hidden modes. Then,
$\Sigma$ is asymptotically stable if and only if $A_1(d)$, or equivalently $A_2(d)$, is a strictly
Hurwitz polynomial matrix.

It is customary to speak about stability of transfer matrices in contrast with
asymptotic stability of state-space representations. A rational function $H(d)$ is
said to be stable if the denominator polynomial $a(d)$ of its irreducible form $H(d) = b(d)/a(d)$
is strictly Hurwitz. We note that, according to this definition, $H(d) = \dfrac{b(d)\varphi(d)}{a(d)\varphi(d)}$,
where $a(d)$, $b(d)$, $\varphi(d)$ are polynomials in $d$ with $a(d)$ and $b(d)$ coprime, is
stable if and only if $a(d)$ is strictly Hurwitz, irrespective of $\varphi(d)$. Likewise, a rational
matrix $H(d) = \{H_{ij}(d)\}$ is said to be stable if all the denominator polynomials
$a_{ij}(d)$ of its irreducible elements $H_{ij}(d) = b_{ij}(d)/a_{ij}(d)$ are strictly Hurwitz. This
is the same as requiring that the irreducible MFDs of $H(d)$ have strictly Hurwitz
denominator matrices. For this reason, there is no difference between stability
of a rational matrix $H(d)$ and stability of its irreducible MFDs. This will be
consequently reflected in our language in that we shall talk indifferently about
stability of either rational matrices or irreducible MFDs.

Figure 3.2-1: The feedback system.

Main points of the section Sequences can be described in terms of $d$-representations.
Likewise, time-invariant linear dynamic systems can be described in terms of transfer
matrices and matrix fraction descriptions. Eq. (12) is central in the polynomial
equation approach to least squares optimization and replaces the usual complex
integral

[integral expression not legible in the source] (3.1-42)

or more generally for $u(\cdot), v(\cdot) \in \ell_2$

$\operatorname{Tr}(u^*(d)v(d)) =$ [right-hand side not legible in the source] (3.1-43)

3.2 Feedback Systems 

Consider the feedback system of Fig. 1 where $\mathcal{P}$ and $\mathcal{K}$ denote two discrete-time
finite dimensional linear time invariant dynamic systems with transfer matrices
$P(d)$ and, respectively, $K(d)$. In Fig. 1 $v(\cdot)$ and $\nu(\cdot)$, $v(k) \in \mathbb{R}^m$ and $\nu(k) \in \mathbb{R}^p$,
represent two exogenous input sequences, and $u(\cdot)$ and $\varepsilon(\cdot)$ the sequences at the
input of $\mathcal{P}$ and, respectively, $\mathcal{K}$.

We say that the feedback system is well-posed if given any bounded input pair
$w(d) := [\,\nu'(d)\ \ v'(d)\,]'$ such that

$$\operatorname{ord} w(d) > -\infty \tag{3.2-1}$$

the response of the feedback system as given by $z(d) := [\,\varepsilon'(d)\ \ u'(d)\,]'$ can be
uniquely determined. To this end, it is immaterial to specify from which initial
states for $\mathcal{P}$ and $\mathcal{K}$ the input $w(\cdot)$ is first applied. Whenever these initial states are
unspecified, by default they will be taken to be zero. Accordingly,

$$z(d) = w(d) + \begin{bmatrix} O_{p \times p} & P(d) \\ -K(d) & O_{m \times m} \end{bmatrix} z(d) \tag{3.2-2}$$

It follows that the well-posedness condition for the feedback system is that the
following determinant be not identically zero

$$\begin{aligned}
\det \begin{bmatrix} I_p & -P(d) \\ K(d) & I_m \end{bmatrix}
&= \det \begin{bmatrix} I_p & -P(d) \\ O_{m \times p} & I_m + K(d)P(d) \end{bmatrix}
 = \det \begin{bmatrix} I_p + P(d)K(d) & O_{p \times m} \\ K(d) & I_m \end{bmatrix} \\
&= \det[I_m + K(d)P(d)] = \det[I_p + P(d)K(d)] \not\equiv 0
\end{aligned} \tag{3.2-3}$$



From now on, it is assumed that the $p \times m$ rational transfer matrix $P(d)$ is such
that

$$\operatorname{ord} P(d) > 0 \tag{3.2-4}$$

In words, $P(d)$ is a strictly-causal matrix sequence. Further, the $m \times p$ rational
transfer matrix $K(d)$ is assumed to be a causal matrix sequence, viz.

$$\operatorname{ord} K(d) \geq 0 \tag{3.2-5}$$

It follows that $\operatorname{ord} K(d)P(d) > 0$ and $\operatorname{ord} P(d)K(d) > 0$. Consequently, (4) and (5)
imply well-posedness of the feedback system.

Definitions 

(3.2-1) The system of Fig. 1 with $\mathcal{P}$ and $\mathcal{K}$ satisfying (4) and, respectively, (5) will
be called the feedback system with plant $\mathcal{P}$ and compensator $\mathcal{K}$.

(3.2-2) The feedback system is internally stable if the transfer matrix

$$H_{zw}(d) = \begin{bmatrix} H_{\varepsilon\nu}(d) & H_{\varepsilon v}(d) \\ H_{u\nu}(d) & H_{uv}(d) \end{bmatrix} \tag{3.2-6}$$

is stable.

(3.2-3) The feedback system is asymptotically stable if the dynamical system resulting
from the feedback interconnection of the dynamical systems $\mathcal{P}$ and $\mathcal{K}$
is such.

We see that, in contrast with asymptotic stability, internal stability is the same as
stability of the four transfer matrices:

$$H_{\varepsilon\nu}(d) = [I_p + P(d)K(d)]^{-1} = I_p - P(d)[I_m + K(d)P(d)]^{-1}K(d) \tag{3.2-7}$$

$$H_{\varepsilon v}(d) = [I_p + P(d)K(d)]^{-1}P(d) = P(d)[I_m + K(d)P(d)]^{-1} \tag{3.2-8}$$

$$H_{u\nu}(d) = -[I_m + K(d)P(d)]^{-1}K(d) = -K(d)[I_p + P(d)K(d)]^{-1} \tag{3.2-9}$$

$$H_{uv}(d) = [I_m + K(d)P(d)]^{-1} = I_m - K(d)[I_p + P(d)K(d)]^{-1}P(d) \tag{3.2-10}$$

In (7)-(10) all the equalities can be verified by inspection. In fact, (9) can be easily
established. Then, the last expression in (7) equals $I_p - P(d)K(d)[I_p + P(d)K(d)]^{-1}$
which, in turn, coincides with $[I_p + P(d)K(d)]^{-1}$. Along the same line, we can check
(8) and (10).
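
The identities (3.2-7)-(3.2-10), together with the determinant equality in (3.2-3), can also be spot-checked numerically. The sketch below is only an illustration: `P` and `K` are random constant matrices standing for $P(d)$ and $K(d)$ evaluated at one fixed value of $d$, scaled down so that the matrices to be inverted are well conditioned.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 3, 2
P = 0.3 * rng.standard_normal((p, m))   # stands for P(d) at a fixed d
K = 0.3 * rng.standard_normal((m, p))   # stands for K(d) at the same d
Ip, Im = np.eye(p), np.eye(m)
inv = np.linalg.inv

# First expressions in (3.2-7)-(3.2-10)
H_eps_nu = inv(Ip + P @ K)
H_eps_v = inv(Ip + P @ K) @ P
H_u_nu = -inv(Im + K @ P) @ K
H_u_v = inv(Im + K @ P)

# Second expressions in (3.2-7)-(3.2-10)
H_eps_nu_alt = Ip - P @ inv(Im + K @ P) @ K
H_eps_v_alt = P @ inv(Im + K @ P)
H_u_nu_alt = -K @ inv(Ip + P @ K)
H_u_v_alt = Im - K @ inv(Ip + P @ K) @ P

# Block matrix whose inverse collects the four transfer matrices
H_zw = inv(np.block([[Ip, -P], [K, Im]]))
```

The four blocks of `H_zw` coincide with the four matrices above, in the arrangement of (3.2-6).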

A simplification takes place whenever $K(d)$ is a stable transfer matrix. In
fact, in such a case, instead of checking that four different transfer matrices are
stable, internal stability of the feedback system can be ascertained by only checking
stability of $H_{\varepsilon v}(d)$. The latter is sometimes referred to as external stability of the
feedback system.

Proposition 3.2-1. Consider the feedback system of Fig. 1. Let the compensator
transfer matrix $K(d)$ be stable. Then, a necessary and sufficient condition for
internal stability of the feedback system is that $H_{\varepsilon v}(d)$ be stable.

Proof Stability of $H_{\varepsilon v}(d)$ is obviously necessary. Sufficiency follows since

$$H_{\varepsilon\nu}(d) = I_p - H_{\varepsilon v}(d)K(d) \tag{3.2-11}$$

$$H_{uv}(d) = I_m - K(d)H_{\varepsilon v}(d) \tag{3.2-12}$$

$$H_{u\nu}(d) = -H_{uv}(d)K(d) \tag{3.2-13}$$

are stable transfer matrices, provided that $K(d)$ is stable.

If $K(d)$ is unstable, internal stability of the feedback system does not follow
from external stability, viz. stability of $H_{\varepsilon v}(d)$.

Problem 3.2-1 Consider a SISO feedback system with $P(d) = B(d)/A(d)$, $A(d)$ and $B(d)$
coprime, $B(d) = d(1 - 2d)$, $K(d) = S(d)/R(d)$, $R(d)$ and $S(d)$ coprime, $R(d) = (1 - 2d)$. Assume
that $A(d) + dS(d)$ is strictly Hurwitz. Show that, though $H_{\varepsilon v}(d)$ is stable, $H_{u\nu}(d)$ is unstable.

The following Fact 1 shows that internal stability and asymptotic stability of the
feedback system are equivalent, whenever $\mathcal{P}$ and $\mathcal{K}$ are free of unstable hidden
modes [Vid85].

Fact 3.2-1. Let the plant $\mathcal{P}$ and the compensator $\mathcal{K}$ be free of unstable hidden
modes. Then, the feedback system is asymptotically stable if and only if it is internally
stable.

For the sake of brevity, keeping in mind Fact 1, throughout this book we shall
simply say that the feedback system is stable whenever it is internally stable and
$\mathcal{P}$ and $\mathcal{K}$ are understood to be free of unstable hidden modes.

Fact 1-1 holds true for the feedback system. Namely, the $d$-characteristic polynomial
of any realization of the feedback system with $\mathcal{P}$ and $\mathcal{K}$ free of nonzero
hidden eigenvalues, is proportional, according to (1-37), to the determinant of the
denominator polynomial of any irreducible MFD of $H_{zw}(d)$. From (2) we find

$$H_{zw}(d) = \begin{bmatrix} I_p & -P(d) \\ K(d) & I_m \end{bmatrix}^{-1} \tag{3.2-14}$$



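Consider the following irreducible MFDs of $P(d)$ and, respectively, $K(d)$

$$P(d) = A_1^{-1}(d)B_1(d) = B_2(d)A_2^{-1}(d) \tag{3.2-15}$$

$$K(d) = R_1^{-1}(d)S_1(d) = S_2(d)R_2^{-1}(d) \tag{3.2-16}$$

We find

$$\begin{bmatrix} I_p & -P(d) \\ K(d) & I_m \end{bmatrix}
= \begin{bmatrix} I_p & -A_1^{-1}(d)B_1(d) \\ R_1^{-1}(d)S_1(d) & I_m \end{bmatrix}
= \begin{bmatrix} A_1^{-1}(d) & O \\ O & R_1^{-1}(d) \end{bmatrix}
\begin{bmatrix} A_1(d) & -B_1(d) \\ S_1(d) & R_1(d) \end{bmatrix} \tag{3.2-17}$$

Then

$$\begin{aligned}
H_{zw}(d) &= \begin{bmatrix} A_1(d) & -B_1(d) \\ S_1(d) & R_1(d) \end{bmatrix}^{-1}
\begin{bmatrix} A_1(d) & O \\ O & R_1(d) \end{bmatrix} \\
&= \begin{bmatrix} R_2(d) & O \\ O & A_2(d) \end{bmatrix}
\begin{bmatrix} R_2(d) & -B_2(d) \\ S_2(d) & A_2(d) \end{bmatrix}^{-1}
\end{aligned} \tag{3.2-18}$$

where the first equality follows from (14) and (17), and the second equality is obtained in a
similar way by using the right coprime MFDs of $P(d)$ and $K(d)$.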

Problem 3.2-2 Show that the two MFDs of $H_{zw}(d)$ in (18) are irreducible. [Hint: The left
MFD $A^{-1}(d)B(d)$ is irreducible if and only if $A(d)$ and $B(d)$ satisfy the Bezout identity (B.10).]

According to Problem 2, internal stability of the feedback system is equivalent to
strict Hurwitzianity of the polynomial denominators of (18). We have

$$\begin{aligned}
\det \begin{bmatrix} A_1(d) & -B_1(d) \\ S_1(d) & R_1(d) \end{bmatrix}
&= \det R_1(d)\, \det\left[A_1(d) + B_1(d)R_1^{-1}(d)S_1(d)\right] \\
&= \det R_1(d)\, \det\left[A_1(d) + B_1(d)S_2(d)R_2^{-1}(d)\right] \\
&= \frac{\det R_1(d)}{\det R_2(d)}\, \det\left[A_1(d)R_2(d) + B_1(d)S_2(d)\right] \\
&= \frac{\det R_1(0)}{\det R_2(0)}\, \det\left[A_1(d)R_2(d) + B_1(d)S_2(d)\right]
\end{aligned} \tag{3.2-19}$$

where the first equality holds since $R_1(d)$ is nonsingular [Kai80, p. 650], and the last
follows from (1-37). The conclusions of the next theorem then follow at once.

Theorem 3.2-1. Consider the feedback system of Fig. 1 with plant and compensator
having the irreducible MFDs (15) and (16). Then, the feedback system is
internally stable if and only if

$$P_1(d) := A_1(d)R_2(d) + B_1(d)S_2(d) \tag{3.2-20}$$

or equivalently

$$P_2(d) := R_1(d)A_2(d) + S_1(d)B_2(d) \tag{3.2-21}$$

are strictly Hurwitz. Further, the $d$-characteristic polynomial $\chi_\Phi(d)$ of any realization
of the feedback system with $\mathcal{P}$ and $\mathcal{K}$ both free of nonzero hidden eigenvalues,
is given by

$$\chi_\Phi(d) = \frac{\det P_1(d)}{\det P_1(0)} = \frac{\det P_2(d)}{\det P_2(0)} \tag{3.2-22}$$

Problem 3.2-3 Consider the plant

$$A(d)y(d) = B(d)z(d) + C(d)e(d)$$

where $[\,z'\ \ e'\,]' =: u$ denotes a partition of the input $u$ into two separate vectors $z$ and $e$. Let
$A^{-1}(d)[\,B(d)\ \ C(d)\,]$ be an irreducible left MFD. Show that a necessary condition for the
existence of a compensator

M-\d)y(d) 

which makes the feedback system internally stable is that the greatest common left divisors of $A(d)$
and $B(d)$ are strictly Hurwitz. Further, show that, under this condition, for the feedback system
internal stability coincides with asymptotic stability if and only if the compensator is realized
without unstable hidden modes and all the unstable eigenvalues of the actual plant realization are
the same, counting their multiplicity, as those of the minimal realization of $H_{yu}(d)$ or, equivalently,
the reciprocals of the roots of $\det A(d)$.
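
In the SISO case the test of Theorem 3.2-1 reduces to checking the roots of the scalar polynomial $A(d)R(d) + B(d)S(d)$. The sketch below is an illustration under stated assumptions: "strictly Hurwitz" is taken to mean that all roots lie in $|d| > 1$, consistently with the asymptotic stability condition on $\chi_\Phi(d)$; the helper `strictly_hurwitz` and the particular choice $A(d) = 1 - 0.5d$, $S(d) = 0.2$ are hypothetical, picked so as to satisfy the hypotheses of Problem 3.2-1. Polynomials are stored by ascending powers of $d$.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def strictly_hurwitz(c, tol=1e-9):
    """True if every root of the polynomial in d lies in |d| > 1 (assumed convention)."""
    c = npoly.polytrim(np.asarray(c, dtype=float), tol)
    if c.size == 1:
        return bool(abs(c[0]) > tol)      # nonzero constant: no roots at all
    return bool(np.all(np.abs(npoly.polyroots(c)) > 1.0 + tol))

# Data in the spirit of Problem 3.2-1 (A and S are hypothetical choices)
A = [1.0, -0.5]          # A(d) = 1 - 0.5 d
B = [0.0, 1.0, -2.0]     # B(d) = d (1 - 2 d)
R = [1.0, -2.0]          # R(d) = 1 - 2 d
S = [0.2]                # S(d) = 0.2

# Closed-loop polynomial A R + B S, cf. (3.2-20) in the scalar case
P1 = npoly.polyadd(npoly.polymul(A, R), npoly.polymul(B, S))

# Denominator of H_{eps v} after cancelling the common factor (1 - 2 d)
A_plus_dS = npoly.polyadd(A, npoly.polymul([0.0, 1.0], S))

internally_stable = strictly_hurwitz(P1)
externally_stable = strictly_hurwitz(A_plus_dS)
```

With these data `P1` equals $(1 - 2d)(1 - 0.3d)$: the root $d = 0.5$ lies inside the unit circle, so the loop is not internally stable, although the cancelled denominator $1 - 0.3d$ is strictly Hurwitz.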

We see from Theorem 1 that if the feedback system is internally stable, (20) and
(21) can be rewritten as follows

$$I_p = A_1(d)M_2(d) + B_1(d)N_2(d) \tag{3.2-23}$$

$$I_m = M_1(d)A_2(d) + N_1(d)B_2(d) \tag{3.2-24}$$

where

$$M_2(d) := R_2(d)P_1^{-1}(d) \qquad N_2(d) := S_2(d)P_1^{-1}(d) \tag{3.2-25}$$

$$M_1(d) := P_2^{-1}(d)R_1(d) \qquad N_1(d) := P_2^{-1}(d)S_1(d) \tag{3.2-26}$$

are stable transfer matrices. We note that the transfer matrix of the controller can
be written as the ratio of the above transfer matrices

$$K(d) = N_2(d)M_2^{-1}(d) \tag{3.2-27}$$
$$\phantom{K(d)} = M_1^{-1}(d)N_1(d) \tag{3.2-28}$$

This representation of the controller transfer matrix is not only necessary for the
feedback system to be internally stable: it also turns out to be sufficient.

Theorem 3.2-2. Consider the feedback system of Fig. 1 and the irreducible MFDs
(15) of the plant. Then, a necessary and sufficient condition for the feedback system
to be internally stable is that the compensator transfer matrix be factorizable as in
(27) (equivalently, (28)) in terms of the ratio of two stable transfer matrices $M_2(d)$
and $N_2(d)$ ($M_1(d)$ and $N_1(d)$) satisfying the identity (23) ((24)). Conversely, a
compensator with a transfer matrix factorizable as in (27) ((28)) in terms of $M_2(d)$
and $N_2(d)$ ($M_1(d)$ and $N_1(d)$) satisfying (23) ((24)) makes the feedback system
internally stable if and only if $M_2(d)$ and $N_2(d)$ ($M_1(d)$ and $N_1(d)$) are both stable
transfer matrices.

Proof That the condition is necessary is proved by (23)-(28). Sufficiency is proved next by
showing that the condition implies stability of the transfer matrices $H_{uv}(d)$, $H_{u\nu}(d)$, $H_{\varepsilon\nu}(d)$,
$H_{\varepsilon v}(d)$.

Using (7)-(10) we find:

$$\begin{aligned}
H_{uv}(d) &= [I_m + K(d)P(d)]^{-1} \\
&= [I_m + M_1^{-1}(d)N_1(d)B_2(d)A_2^{-1}(d)]^{-1} \\
&= A_2(d)[M_1(d)A_2(d) + N_1(d)B_2(d)]^{-1}M_1(d) \\
&= A_2(d)M_1(d) \qquad [(24)]
\end{aligned} \tag{3.2-29}$$

$$\begin{aligned}
H_{u\nu}(d) &= -[I_m + K(d)P(d)]^{-1}K(d) \\
&= -A_2(d)M_1(d)M_1^{-1}(d)N_1(d) \qquad [(29)] \\
&= -A_2(d)N_1(d)
\end{aligned} \tag{3.2-30}$$

$$\begin{aligned}
H_{\varepsilon\nu}(d) &= [I_p + P(d)K(d)]^{-1} \\
&= M_2(d)[A_1(d)M_2(d) + B_1(d)N_2(d)]^{-1}A_1(d) \\
&= M_2(d)A_1(d) \qquad [(23)]
\end{aligned} \tag{3.2-31}$$

$$\begin{aligned}
H_{\varepsilon v}(d) &= [I_p + P(d)K(d)]^{-1}P(d) \\
&= M_2(d)A_1(d)A_1^{-1}(d)B_1(d) \qquad [(31)] \\
&= M_2(d)B_1(d)
\end{aligned} \tag{3.2-32}$$

We see that since $M_1(d)$, $N_1(d)$, $M_2(d)$ are stable transfer matrices, (29)-(32) are such.

Problem 3.2-4 Consider the feedback system of Fig. 1 and (29)-(32). Show that $H_{\gamma\nu}(d) = -H_{u\nu}(d)$
and $H_{yv}(d) = H_{\varepsilon v}(d)$ imply

$$N_2(d)A_1(d) = A_2(d)N_1(d) \tag{3.2-30a}$$

$$B_2(d)M_1(d) = M_2(d)B_1(d) \tag{3.2-32a}$$

Further, verify that $H_{\varepsilon\nu}(d) = H_{y\nu}(d) + I_p$ and $H_{uv}(d) = -H_{\gamma v}(d) + I_m$ imply

$$I_p = M_2(d)A_1(d) + B_2(d)N_1(d) \tag{3.2-23a}$$

$$I_m = A_2(d)M_1(d) + N_2(d)B_1(d) \tag{3.2-24a}$$

It is to be pointed out that, given irreducible MFDs of $P(d)$ as in (15), (23) and
(24) can always be solved w.r.t. $(M_2(d), N_2(d))$ and, respectively, $(M_1(d), N_1(d))$.
In fact, since $A_1(d)$ and $B_1(d)$ are left coprime, there are polynomial matrices
$(M_2(d), N_2(d))$ solving the Bezout identity (23). Now polynomial matrices in $d$ are
stable transfer matrices representing impulse-response matrix-sequences of finite
length. Let, then, $(M_{20}(d), N_{20}(d))$ and $(M_{10}(d), N_{10}(d))$ be two pairs of stable
transfer matrices solving (23) and, respectively, (24). It follows from the pertinent
results of Appendix C that all other stable transfer matrices solving (23) and,
respectively, (24) are given by

$$\left.\begin{aligned}
M_2(d) &= M_{20}(d) - B_2(d)Q(d) \\
N_2(d) &= N_{20}(d) + A_2(d)Q(d)
\end{aligned}\right\} \tag{3.2-33}$$

and

$$\left.\begin{aligned}
M_1(d) &= M_{10}(d) - Q(d)B_1(d) \\
N_1(d) &= N_{10}(d) + Q(d)A_1(d)
\end{aligned}\right\} \tag{3.2-34}$$

where $Q(d)$ is any $m \times p$ stable transfer matrix. Summing up, we have the following
result.

Theorem 3.2-3 (YJBK Parameterization). Consider the feedback system of
Fig. 1 and the irreducible MFDs (15) of the plant. Then, there exist compensator
transfer matrices $K(d)$ as in (27) and (28) which make the feedback system internally
stable. Given one such transfer matrix

$$K_0(d) = N_{20}(d)M_{20}^{-1}(d) \tag{3.2-35}$$
$$\phantom{K_0(d)} = M_{10}^{-1}(d)N_{10}(d) \tag{3.2-36}$$

with $(M_{20}(d), N_{20}(d))$ and $(M_{10}(d), N_{10}(d))$ two pairs of stable transfer matrices
satisfying (23) and, respectively, (24), all other transfer matrices $K(d)$ which make
the feedback system internally stable are given by

$$K(d) = [N_{20}(d) + A_2(d)Q(d)]\,[M_{20}(d) - B_2(d)Q(d)]^{-1} \tag{3.2-37}$$
$$\phantom{K(d)} = [M_{10}(d) - Q(d)B_1(d)]^{-1}\,[N_{10}(d) + Q(d)A_1(d)] \tag{3.2-38}$$

where $Q(d)$ is any $m \times p$ stable transfer matrix.

Eq. (37) and (38) give the YJBK parameterization of all $K(d)$ which make
the feedback system internally stable. Here, the acronym YJBK stands for Youla,
Jabr and Bongiorno [YJB76] and Kucera [Kuc79], who first proposed the set of all
stabilizing compensators in the Q-parametric form.

Example 3.2-1 Let

$$P(d) = \frac{4d}{1 - 4d^2} \tag{3.2-39}$$

Here, $A(d) = 1 - 4d^2$ and $B(d) = 4d$ are coprime polynomials. Eq. (23), or (24), becomes

$$(1 - 4d^2)M(d) + 4dN(d) = 1 \tag{3.2-40}$$

This is a Bezout identity having polynomial solutions $(M_0(d), N_0(d))$. This solution can be made
unique by requiring that either $\partial M_0(d) < 1$ or $\partial N_0(d) < 2$, $\partial p(d)$ denoting the degree of the
polynomial $p(d)$. The minimum degree solution w.r.t. $M_0(d)$, viz. the one with $\partial M_0(d) < 1$, can
be easily computed by equating the coefficients of equal powers of the polynomials on both sides
of (40). We get

$$M_0(d) = 1 \quad \text{and} \quad N_0(d) = d \tag{3.2-41}$$

Then, (37), or (38), gives the YJBK parametric form of all compensator transfer functions making,
for the given $P(d)$, the feedback system internally stable

$$K(d) = \frac{d + (1 - 4d^2)Q(d)}{1 - 4dQ(d)} = \frac{d\,\delta(d) + (1 - 4d^2)n(d)}{\delta(d) - 4d\,n(d)} \tag{3.2-42}$$

where $Q(d) = n(d)/\delta(d)$ with $n(d)$ any polynomial and $\delta(d)$ any strictly Hurwitz polynomial.
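
Example 3.2-1 can be checked with a few lines of polynomial arithmetic. In the sketch below the polynomials `n_poly` and `delta` are arbitrary illustrative choices (with `delta` having its only root at $d = 4$); coefficients are stored by ascending powers of $d$. The code verifies the Bezout identity (3.2-40) and shows that the closed-loop polynomial $A(d)R(d) + B(d)S(d)$, with $S$ and $R$ the numerator and denominator in (3.2-42), reduces to $\delta(d)$.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

A = [1.0, 0.0, -4.0]      # A(d) = 1 - 4 d^2
B = [0.0, 4.0]            # B(d) = 4 d
M0 = [1.0]                # M0(d) = 1
N0 = [0.0, 1.0]           # N0(d) = d

# Bezout identity (3.2-40): A M0 + B N0 = 1
bezout = npoly.polytrim(
    npoly.polyadd(npoly.polymul(A, M0), npoly.polymul(B, N0)), 1e-12)

# Hypothetical YJBK parameter Q(d) = n(d) / delta(d)
n_poly = [1.0, 0.5]       # n(d) = 1 + 0.5 d, any polynomial will do
delta = [1.0, -0.25]      # delta(d) = 1 - 0.25 d, root at d = 4

# Numerator and denominator of K(d) in (3.2-42)
S = npoly.polyadd(npoly.polymul(N0, delta), npoly.polymul(A, n_poly))
R = npoly.polysub(npoly.polymul(M0, delta), npoly.polymul(B, n_poly))

# Closed-loop polynomial A R + B S
closed_loop = npoly.polytrim(
    npoly.polyadd(npoly.polymul(A, R), npoly.polymul(B, S)), 1e-12)
```

Since `closed_loop` coincides with `delta`, the closed-loop characteristic polynomial is fixed by the denominator of $Q(d)$ alone, whatever `n_poly` is.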

The next problem shows that the characteristic polynomial of the feedback system can
be freely assigned by suitably selecting the denominator matrix of the MFDs of
$Q(d)$.

Problem 3.2-5 Consider a YJBK parameterized compensator with transfer matrix $K(d)$ as in
(37). Let $(M_{20}(d), N_{20}(d))$ be a pair of polynomial matrices satisfying (23). Write $Q(d)$ in the
form of a right coprime MFD $Q(d) = L_2(d)D_2^{-1}(d)$. Show then that

$$K(d) = [N_{20}(d)D_2(d) + A_2(d)L_2(d)]\,[M_{20}(d)D_2(d) - B_2(d)L_2(d)]^{-1},$$

and $P_1(d)$ in (20) equals $D_2(d)$.

Another feature of the YJBK parameterization which turns out to be useful in
optimization problems [Vid85] is that the transfer matrices $H_{uv}(d)$, $H_{u\nu}(d)$, $H_{\varepsilon\nu}(d)$
and $H_{\varepsilon v}(d)$, as shown in the proof of Theorem 3.2-2, are linear in
$M_1(d)$, $N_1(d)$ and $M_2(d)$. It follows from (33) and (34) that all the above transfer
matrices are affine in the $Q(d)$ parameter.

By (38) all control laws making the feedback system internally stable can be
written as follows

$$u(d) = -K_0(d)\varepsilon(d) - M_{10}^{-1}(d)Q(d)A_1(d)\left[\varepsilon(d) - A_1^{-1}(d)B_1(d)u(d)\right] \tag{3.2-43}$$

Fig. 2 depicts the feedback system with a YJBK parameterized compensator as in
(43). As can be seen, the "outer" loop with output feedback $-K_0(d)$ is augmented
by an "inner" loop driven by the difference between the (disturbed) plant output
$\varepsilon$ and the output from the plant model $A_1^{-1}(d)B_1(d)$. If $A_1(d)$ is strictly Hurwitz,
(24) is solved by $N_{10}(d) = O_{m \times p}$ and $M_{10}(d) = A_2^{-1}(d)$. Then, the scheme of Fig. 2
can be simplified as in Fig. 3 where we set

$$\bar Q(d) := A_2(d)Q(d)A_1(d) \tag{3.2-44}$$

Note that since $Q(d)$ is any stable $m \times p$ transfer matrix, $\bar Q(d)$ is any $m \times p$ stable
transfer matrix as well. The scheme of Fig. 2 has been advocated [MZ89a] to be
particularly advantageous in process control applications, where $P(d)$ turns out to be
stable.

Figure 3.2-2: The feedback system with a Q-parameterized compensator.

Figure 3.2-3: The feedback system with a Q-parameterized compensator for P(d)
stable.

We now turn from internal to asymptotic stability. From Fact 1 and Theorem
1, the following result follows at once.

Theorem 3.2-4. There exist compensators making the feedback system of Fig. 1
asymptotically stable if and only if the plant is free of unstable hidden modes. All
such compensators are realizations free of unstable hidden modes of the YJBK
parameterized transfer matrices $K(d)$ (37) or (38).

Main points of the section Feedback systems can be studied by using matrix
fraction descriptions and properties of polynomial matrices. These tools can be used
to identify neatly all feedback compensators, in the YJBK parameterized
form, which stabilize the plant.

3.3 Robust Stability 

We consider again the feedback configuration of Fig. 2-1 where the plant is described
by

$$y(d) = P(d)u(d) \tag{3.3-1}$$

with $P(d)$ the true or actual plant transfer matrix, and

$$u(d) = -K(d)y(d) \tag{3.3-2}$$

with $K(d)$ the compensator transfer matrix. We assume that the compensator is
designed based upon the nominal plant transfer matrix $P^\circ(d)$ but applied to the
actual plant $P(d)$.

The robust stability issue is to establish conditions under which stability of the
designed closed loop implies stability of the actual closed loop. In order to address
the point, it is convenient to let

$$G(d) := K(d)P(d) \tag{3.3-3}$$
$$G^\circ(d) := K(d)P^\circ(d) \tag{3.3-4}$$

$G(d)$ ($G^\circ(d)$) is referred to as the actual (nominal) loop transfer matrix, viz. the
transfer matrix of the plant/compensator cascade. We also assume that the actual
plant transfer matrix equals the nominal plant transfer matrix postmultiplied by a
multiplicative perturbation matrix, i.e. we have

$$P(d) = P^\circ(d)M(d) \tag{3.3-5}$$

Hence,

$$G(d) = G^\circ(d)M(d) \tag{3.3-6}$$

We note that the relative, or percentage, error of the loop transfer matrix can be
expressed in terms of the multiplicative perturbation matrix $M(d)$. In fact,

$$\begin{aligned}
G^{-1}(d)\left[G^\circ(d) - G(d)\right] &= M^{-1}(d)\left[G^\circ(d)\right]^{-1}\left[G^\circ(d) - G^\circ(d)M(d)\right] \\
&= M^{-1}(d) - I_m
\end{aligned} \tag{3.3-7}$$




For a complex square matrix $A$, denote by $\bar\sigma(A)$ and $\underline\sigma(A)$ the maximum and the
minimum singular value of $A$, i.e.

$$\bar\sigma(A) := +\lambda_{\max}^{1/2}(A^*A) \quad \text{and} \quad \underline\sigma(A) := +\lambda_{\min}^{1/2}(A^*A) \tag{3.3-8}$$

where the star denotes Hermitian, i.e. $A^*$ is the complex-conjugate transpose of
$A$. Further, we denote by $\Omega$ the contour in the complex plane consisting of the
unit circle, suitably indented outwards around the open-loop poles of
$G^\circ(d)$ on the circle.

We can now state the result on robust stability [BGW90] which will be used in
Sect. 4.6.

Fact 3.3-1. Denote by: $\chi_{ol}(d)$, $\chi^\circ_{ol}(d)$ the $d$-characteristic polynomial of the actual
and, respectively, nominal open-loop system; and $\chi_{cl}(d)$, $\chi^\circ_{cl}(d)$ the $d$-characteristic
polynomial of the actual and, respectively, nominal closed-loop system. Let:

• $\chi_{ol}(d)$ and $\chi^\circ_{ol}(d)$ have the same number of roots inside the unit circle;

• $\chi_{ol}(d)$ and $\chi^\circ_{ol}(d)$ have the same unit circle roots;

• $\chi^\circ_{cl}(d)$ is strictly Hurwitz.

Then $\chi_{cl}(d)$ is strictly Hurwitz provided that at each $d \in \Omega$

$$\bar\sigma\left(M^{-1}(d) - I_m\right) < \min(\alpha(d), 1) \tag{3.3-9}$$
$$\alpha(d) := \underline\sigma\left(I_m + G^\circ(d)\right). \tag{3.3-10}$$

The transfer matrix $I_m + G^\circ(d)$ is called the return difference of the nominal
loop and (9) and (10) show that it plays an important role in robust stability. The
importance of Fact 1 is that it shows that the feedback system of Fig. 2-1 remains
stable when the relative error of the loop transfer matrix caused by multiplicative
perturbations is small as compared to the nominal return difference.
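
Inequality (3.3-9) lends itself to a simple grid test on the unit circle. The following sketch only evaluates (3.3-9)-(3.3-10); the three hypotheses of Fact 3.3-1 on the characteristic polynomials are assumed to hold and are not checked. It further assumes that $G^\circ(d)$ has no poles on the unit circle, so that $\Omega$ needs no indentation. The function name `condition_holds`, the nominal loop $G^\circ(d) = 0.5d/(1 - 0.5d)$ and the constant gain perturbations are illustrative choices.

```python
import numpy as np

def condition_holds(G0, M, n_grid=720):
    """Check (3.3-9) on a grid of the unit circle.

    G0 and M are callables returning the (m x m) values of the nominal loop
    transfer matrix and of the multiplicative perturbation at a complex d.
    """
    for theta in np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False):
        d = np.exp(1j * theta)
        G0d = np.atleast_2d(G0(d))
        Md = np.atleast_2d(M(d))
        I = np.eye(G0d.shape[0])
        # largest singular value of the relative error M^{-1} - I
        rel_err = np.linalg.svd(np.linalg.inv(Md) - I, compute_uv=False)[0]
        # smallest singular value of the nominal return difference
        alpha = np.linalg.svd(I + G0d, compute_uv=False)[-1]
        if not rel_err < min(alpha, 1.0):
            return False
    return True

# Hypothetical SISO nominal loop
def G0(d):
    return 0.5 * d / (1.0 - 0.5 * d)

small_gain_error = condition_holds(G0, lambda d: 1.2)   # 20 % gain increase
large_gain_error = condition_holds(G0, lambda d: 4.0)   # fourfold gain
```

For this loop the return difference is $1/(1 - 0.5d)$, whose modulus on the unit circle is never below $2/3$; a gain perturbation of 1.2 passes the test, while a gain of 4 fails it. A grid test of this kind can only suggest, not prove, that the inequality holds at every point of $\Omega$.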

We shall now use a different argument to point out a more generic and somewhat
less direct result on robust stability. Namely, we shall show that any nominally
stabilizing compensator yields robust stability, being capable of stabilizing all plants
in a nontrivial neighborhood of the nominal one. This is a result of paramount
interest which is worth studying in some detail for its far-reaching implications. To
see this, we first point out (Problem 2.4-5) that the output dynamic compensation
(2) can be looked at as a state-feedback compensation

$$u(k) = Fx(k) \tag{3.3-11}$$

provided that $x(k)$ be a plant state made up of a sufficient number of past
input-output pairs

$$x(k) := \left[\,y'(k-n+1)\ \cdots\ y'(k)\ \ u'(k-n+1)\ \cdots\ u'(k-1)\,\right]'. \tag{3.3-12}$$

Further, if the nominal plant is stabilizable and detectable, neglecting its possible
stable hidden modes, which are of no concern to us, it can be described as follows

$$\left.\begin{aligned}
x(k+1) &= \Phi^\circ x(k) + G^\circ u(k) \\
y(k) &= Hx(k) \\
H &= \left[\,O_{p \times (n-1)p}\ \ I_p\ \ O_{p \times (n-1)m}\,\right]
\end{aligned}\right\} \tag{3.3-13}$$

We assume that the nominal closed loop system (11)-(13) is asymptotically stable,
viz.

$$\Phi^\circ_{cl} := \Phi^\circ + G^\circ F \tag{3.3-14}$$

is a stability matrix. Then, there are positive reals $\gamma \geq 1$ and $0 < \lambda < 1$ such that,
for $k = 0, 1, \cdots$

$$\left\|(\Phi^\circ_{cl})^k\right\| \leq \gamma\lambda^k =: w^\circ_{cl}(k) \tag{3.3-15}$$

where $\|\cdot\|$ denotes any matrix norm. Consider time varying perturbed plants

$$x(k+1) = [\Phi^\circ + \tilde\Phi(k)]x(k) + [G^\circ + \tilde G(k)]u(k) \tag{3.3-16}$$

such that the perturbations $\tilde\Phi(k)$ and $\tilde G(k)$ belong to the sets

$$\tilde\Phi(k) \in \left\{\tilde\Phi(k) \in \mathbb{R}^{N \times N} \;\middle|\; \|\tilde\Phi(k)\| \leq \varphi\right\} \tag{3.3-17}$$

$$\tilde G(k) \in \left\{\tilde G(k) \in \mathbb{R}^{N \times m} \;\middle|\; \|\tilde G(k)\| \leq g\right\} \tag{3.3-18}$$

where $N := \dim x$, and $\varphi$ and $g$ are positive reals. Then, we have the following
result.

Theorem 3.3-1. (Robust Stability of State Feedback Systems) Consider
the time-varying perturbed plants (16) with a fixed state-feedback compensation (11)
such that the nominal closed-loop system with transition matrix (14) is asymptotically
stable. Then, for all perturbed plants in the neighborhood of the nominal plant
specified by the sets (17) and (18), the closed-loop system remains exponentially
stable, whenever

$$\frac{\gamma}{1 - \lambda}\,(\varphi + g\|F\|) < 1 \tag{3.3-19}$$

or

$$\|w^\circ_{cl}(\cdot)\|_1\,(\varphi + g\|F\|) < 1 \tag{3.3-20}$$

where $\|w^\circ_{cl}(\cdot)\|_1 := \sum_{k=0}^{\infty} |w^\circ_{cl}(k)|$.

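In order to prove Theorem 1, we avail of the following lemma.

Lemma 3.3-1. (Bellman-Gronwall [Des70a]) Let $\{z(k)\}_{k=0}^{\infty}$ be a nonnegative
sequence and $m$ and $c$ two nonnegative reals such that

$$z(k) \leq c + \sum_{i=0}^{k-1} m\,z(i) \tag{3.3-21}$$

Then,

$$z(k) \leq c(1 + m)^k \tag{3.3-22}$$

Proof Let

$$h(k) := c + \sum_{i=0}^{k-1} m\,z(i) \tag{3.3-23}$$

It follows that

$$\begin{aligned}
h(k) - h(k-1) &= m\,z(k-1) \\
&\leq m\,h(k-1) \qquad [(21), (23)]
\end{aligned}$$

Then

$$\begin{aligned}
h(k) &\leq (1 + m)h(k-1) \\
&\leq (1 + m)^k h(0) \\
&= c(1 + m)^k
\end{aligned}$$

Since $z(k) \leq h(k)$ by (21) and (23), (22) follows.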





Proof of Theorem 1. By defining $\tilde\Phi_{cl}(k) := \tilde\Phi(k) + \tilde G(k)F$, the perturbed closed-loop system is

$$x(k+1) = \Phi^\circ_{cl}x(k) + \tilde\Phi_{cl}(k)x(k), \qquad k = 0, 1, \cdots$$

Then,

$$x(k) = (\Phi^\circ_{cl})^k x(0) + \sum_{i=0}^{k-1} (\Phi^\circ_{cl})^{k-1-i}\,\tilde\Phi_{cl}(i)\,x(i)$$

From (15), (17) and (18), it follows that

$$\|x(k)\| \leq \gamma\lambda^k\|x(0)\| + \gamma\lambda^k\lambda^{-1}(\varphi + g\|F\|)\sum_{i=0}^{k-1}\lambda^{-i}\|x(i)\|$$

or

$$z(k) \leq c + \sum_{i=0}^{k-1} m\,z(i)$$

with $z(i) := \lambda^{-i}\|x(i)\|$, $c := \gamma\|x(0)\|$ and $m := \gamma\lambda^{-1}(\varphi + g\|F\|)$. Then, by virtue of the
Bellman-Gronwall Lemma,

$$z(k) = \lambda^{-k}\|x(k)\| \leq c(1 + m)^k = \gamma\|x(0)\|\left[1 + \gamma\lambda^{-1}(\varphi + g\|F\|)\right]^k$$

or

$$\|x(k)\| \leq \gamma\|x(0)\|\left[\lambda + \gamma(\varphi + g\|F\|)\right]^k$$

The conclusion is that, as $k \to \infty$, $\|x(k)\|$ tends exponentially to zero, whenever (19) is fulfilled.
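
The bound obtained in the proof can be observed in simulation. The sketch below is an illustration under assumptions of its own: the matrices `Phi0`, `G0`, `F` are hypothetical and chosen so that $\Phi^\circ + G^\circ F = \mathrm{diag}(0.5, 0.4)$, for which the spectral norm gives $\gamma = 1$, $\lambda = 0.5$ in (3.3-15); the perturbations are random matrices rescaled to spectral norm $\varphi$ and $g$ at every step.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nominal plant and state feedback: Phi0 + G0 F = diag(0.5, 0.4)
Phi0 = np.array([[1.1, 0.0], [0.0, 0.4]])
G0 = np.array([[1.0], [0.0]])
F = np.array([[-0.6, 0.0]])
gamma, lam = 1.0, 0.5            # ||(Phi0 + G0 F)^k||_2 = 0.5^k
phi, g = 0.2, 0.1                # perturbation sizes in (3.3-17), (3.3-18)

norm_F = np.linalg.norm(F, 2)
margin = gamma / (1.0 - lam) * (phi + g * norm_F)   # left-hand side of (3.3-19)
rate = lam + gamma * (phi + g * norm_F)             # decay rate from the proof

def scaled(shape, size):
    """Random matrix rescaled to have spectral norm equal to size."""
    E = rng.standard_normal(shape)
    return size * E / np.linalg.norm(E, 2)

x = np.array([1.0, -1.0])
x0_norm = np.linalg.norm(x)
norms = [x0_norm]
for k in range(40):
    u = F @ x                                        # state feedback (3.3-11)
    x = (Phi0 + scaled((2, 2), phi)) @ x + (G0 + scaled((2, 1), g)) @ u
    norms.append(np.linalg.norm(x))

bounds = [gamma * x0_norm * rate ** k for k in range(len(norms))]
```

Here `margin` is 0.52, so (3.3-19) holds, and every entry of `norms` stays below the corresponding entry of `bounds`. The bound is conservative: the simulated trajectories typically decay faster.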

Note that the nonnegative sequence $\{w^\circ_{cl}(k)\}_{k=0}^{\infty}$ in (15) can be interpreted as
the impulse response of the SISO first order system

$$\begin{aligned}
\xi(k+1) &= \lambda\xi(k) + \gamma\lambda\nu(k) \\
\zeta(k) &= \xi(k) + \gamma\nu(k)
\end{aligned}$$

According to (15), $w^\circ_{cl}(k)$ upperbounds the norm of the $k$-th power of the
state-transition matrix of the nominal closed-loop system. The more damped the latter,
the smaller $\|w^\circ_{cl}(\cdot)\|_1$ can be made, and, by (20), the larger the size, as measured
by $\varphi + g\|F\|$, of plant perturbations which do not destabilize the feedback system.

Main points of the section For multiplicative perturbations affecting the plant 
nominal transfer matrix, robust stability of the feedback system can be analyzed 
via (9) and (10) by comparing the size of the relative error (7) of the loop transfer 
matrix with the size of the return difference of the nominal loop. A more generic and 
qualitative result, based on the bare fact that any output dynamic compensation 
can be looked at as a state feedback compensation, is that any nominally stabilizing 
compensator yields robust stability, being capable of stabilizing all plants in a 
nontrivial neighborhood of the nominal one. 

3.4 Streamlined Notations 

It is convenient in several instances to depart from the notational conventions
adopted so far for representing time invariant linear systems in I/O form. An I/O
description of such systems was given in terms of $d$-representations of sequences
and MFDs of transfer functions as $y(d) = A^{-1}(d)[B(d)u(d) + T(d)]$ or

$$A(d)y(d) = B(d)u(d) + T(d) \tag{3.4-1}$$

where $T(d)$ is a polynomial vector depending upon the initial conditions (Cf. (1-27)).
This is an I/O global description, in that it allows us to compute the whole
system output response $y(d)$ once the whole input sequence $u(d)$ and the initial
conditions are assigned.

However, (1) bears upon an I/O local representation as well. This can be seen
as follows. Let

$$A(d) = I_p + A_1 d + \cdots + A_{\partial A}d^{\partial A} \tag{3.4-2}$$
$$B(d) = B_1 d + \cdots + B_{\partial B}d^{\partial B} \tag{3.4-3}$$

Then, if $y(d) = \sum_{k=0}^{\infty} y(k)d^k$ and $u(d) = \sum_{k=0}^{\infty} u(k)d^k$, (1) can be written as the
following difference equation

$$y(t) + A_1 y(t-1) + \cdots + A_{\partial A}y(t - \partial A) = B_1 u(t-1) + \cdots + B_{\partial B}u(t - \partial B) \tag{3.4-4}$$

for every $t$ such that $t > \max(\partial A(d), \partial B(d), \partial T(d))$.

This, in turn, can be rewritten in shorthand form as follows

$$A(d)y(t) = B(d)u(t) \tag{3.4-5}$$

The last equation is the same as (4), rewritten in streamlined notation form.

Instead of interpreting $d$ as a position marker as in (1), in (5) $d$ is used as the unit
backward shift operator, viz. $dy(t) = y(t-1)$. The reader is warned not to believe
that (5) implies that the output vector $y(t)$ can be computed by premultiplying the
input vector $u(t)$ by the matrix $A^{-1}(d)B(d)$. The I/O representation (5) is handy
in that, though it gives no account of the initial conditions, it is appropriate for both
stability analysis and synthesis purposes.
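
Read with $d$ as the backward shift, (3.4-5) is just the recursion (3.4-4). A minimal sketch of this reading follows; the function `simulate` and the scalar example are illustrative, and all signals are assumed to be zero for negative times, which amounts to neglecting the initial-condition term $T(d)$.

```python
import numpy as np

def simulate(A, B, u):
    """Run the recursion (3.4-4) for A(d) y(t) = B(d) u(t).

    A[i] and B[j] are the matrix coefficients of d^i and d^j, with A[0] = I
    and B[0] = 0 (strict causality). u has shape (T, m). Signals at negative
    times are taken to be zero.
    """
    T = u.shape[0]
    p = A[0].shape[0]
    y = np.zeros((T, p))
    for t in range(T):
        acc = np.zeros(p)
        for j in range(1, len(B)):
            if t - j >= 0:
                acc += B[j] @ u[t - j]       # B_j u(t - j)
        for i in range(1, len(A)):
            if t - i >= 0:
                acc -= A[i] @ y[t - i]       # -A_i y(t - i)
        y[t] = acc
    return y

# Hypothetical scalar example: (1 - 0.5 d) y(t) = d u(t), unit step input
A = [np.eye(1), np.array([[-0.5]])]
B = [np.zeros((1, 1)), np.array([[1.0]])]
u = np.ones((5, 1))
y = simulate(A, B, u)
```

The output is produced sample by sample from past data only, which is precisely what the local representation (5) expresses; no division by $A(d)$ is involved.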

Main points of the section For both study and design purposes, it is convenient
to adopt system local I/O representations in a streamlined notation form as (5),
where $d$ plays the role of the unit backward shift operator.

Problem 3.4-1 Consider the difference equation

$$G(d)r(t) = 0$$

with

$$r(t) \in \mathbb{R} \quad \text{and} \quad G(d) = 1 + g_1 d + \cdots + g_n d^n.$$

Set

$$\hat r(d) = \sum_{k=0}^{\infty} r(k)d^k \quad \text{and} \quad x(0) := \left[\,r(-n+1)\ \cdots\ r(-1)\ \ r(0)\,\right]'.$$

Show that

$$\hat r(d) = T(d)/G(d)$$

for a suitable polynomial $T(d)$. [Hint: Put $r(t)$ in state-space representation.]

3.5 1-DOF Trackers 

The tracking problem consists of finding inputs to a given plant so as to make
its output $y(t) \in \mathbb{R}^p$ as close as possible to a reference variable $r(t) \in \mathbb{R}^p$.
Specifically, it is required to design a feedback control law which, while stabilizing
the closed-loop system, as $t \to \infty$ reduces to zero the tracking error

$$\varepsilon(t) := y(t) - r(t) \tag{3.5-1}$$

irrespective of the initial conditions. Whenever this happens, we say that asymptotic
tracking is achieved or, in case of a constant reference, that the controller is
offset free. Typically, the design is carried out by assuming that $r(t)$ either belongs
to a family of possible references or is preassigned.

Figure 3.5-1: Unity-feedback configuration of a closed-loop system with a 1-DOF
controller. [Block diagram; signals and blocks: $r(t)$, $-\varepsilon(t)$, $K(d)$, $u(t)$, $P(d)$, $y(t)$.]

Figure 3.5-2: Closed-loop system with a 2-DOF controller. [Block diagram; signals and blocks: $r(t)$, $K(d)$, $u(t)$, $P(d)$, $y(t)$.]

Basically, there are two alternative approaches to solve the tracking problem.
In the first, $\varepsilon(t)$ is the only input to the controller. For this reason the latter is
sometimes referred to as a one-degree-of-freedom (1-DOF) controller or tracker.
As shown in Fig. 1, in such a case the closed-loop system results in a unity-feedback
configuration.

In the second approach, depicted in Fig. 2, the controller processes two separate
inputs, viz. $y(t)$ and $r(t)$, in an independent fashion. For this reason, it is sometimes
referred to as a two-degrees-of-freedom (2-DOF) controller or tracker.

While 2-DOF controllers will be discussed in future chapters, we focus hereafter
on how to solve the asymptotic tracking problem by 1-DOF controllers. Specifically,
we show how to embed such a problem in the one of stabilizing a feedback system.

We consider a plant with the same number of inputs and outputs

$$\left.\begin{aligned}
A(d)y(t) &= B(d)u(t) \\
\dim y(t) &= \dim u(t) = p
\end{aligned}\right\} \tag{3.5-2}$$

with $A^{-1}(d)B(d)$ a left-coprime MFD of the plant transfer matrix $P(d)$. Note that
in (2) we use the streamlined notation of Sect. 4. Further, we assume that the
reference $r(t)$ is such that

$$D(d)r(t) = O_p \tag{3.5-3}$$

for some polynomial diagonal matrix $D(d)$ such that $D(0) = I_p$ and whose elements
have only simple roots on the unit circle. This amounts to assuming that $r(t)$ is a
bounded periodic sequence. E.g., if $D(d) = 1 - d$, (3) yields $r(t) - r(t-1) = 0$, i.e.
$r(\cdot)$ is a constant sequence.

Problem 3.5-1 (Sinusoidal Reference) Consider a polynomial $D(d)$ with roots $e^{j\theta}$ and $e^{-j\theta}$,
$\theta \in [0, \pi]$. Find the corresponding difference equation (3) for $r(t)$. Plot $r(t)$ as a function of $t$,
$t = 0, 1, \cdots$, for various $\theta$, assuming that $r(-2) = r(-1) = 1$.

Figure 3.5-3: Unity-feedback closed loop system with a 1-DOF controller for
asymptotic tracking. [Block diagram; signals and blocks: $r(t)$, $-\varepsilon(t)$, $S_2(d)R_2^{-1}(d)$, $D^{-1}(d)$, $u(t)$, $A^{-1}(d)B(d)$, $y(t)$; overall compensator $D^{-1}(d)S_2(d)R_2^{-1}(d)$.]
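
The reference model (3) can be run as a recursion. The sketch below covers the scalar sinusoidal case in the setting of Problem 3.5-1, assuming $D(d) = (1 - e^{j\theta}d)(1 - e^{-j\theta}d) = 1 - 2\cos\theta\, d + d^2$, which has $D(0) = 1$ and roots $e^{\pm j\theta}$; the function name `reference` is illustrative.

```python
import numpy as np

def reference(theta, T, r_m2=1.0, r_m1=1.0):
    """Generate r(0), ..., r(T-1) from D(d) r(t) = 0 with
    D(d) = 1 - 2 cos(theta) d + d^2, given r(-2) and r(-1)."""
    r = [r_m2, r_m1]
    for _ in range(T):
        # r(t) = 2 cos(theta) r(t-1) - r(t-2)
        r.append(2.0 * np.cos(theta) * r[-1] - r[-2])
    return np.array(r[2:])

r_sixth = reference(np.pi / 3, 7)     # period 6 samples
r_quarter = reference(np.pi / 2, 4)   # period 4 samples
```

The generated sequences are bounded and periodic whenever $\theta$ is a rational multiple of $\pi$; the values $\theta = 0$ and $\theta = \pi$ are left out because they give a double root of this $D(d)$, which the simple-root assumption excludes.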



We now combine (2) and (3) so as to get a representation for e(t) in terms of u(t).
This can be achieved by, first, premultiplying (2) by D(d) and (3) by A(d) and,
next, subtracting the second from the first. Accordingly,

    A(d) D(d) e(t) = B(d) v(t)                                         (3.5-4)
    v(t) := D(d) u(t)                                                  (3.5-5)

Eq. 4 defines a new plant with input v(t) and output e(t). If we can find a
compensator which stabilizes (4), then

    e(t) \to O_p   and   v(t) \to O_p      (t \to \infty)



Assuming that D(d) and B(d) are left coprime, there exist right coprime polynomial
matrices R_2(d) and S_2(d) such as to make

    \det [ A(d) D(d) R_2(d) + B(d) S_2(d) ]                            (3.5-6)

strictly Hurwitz. According to Theorem 2-1, (6) gives, apart from a multiplicative
constant, the characteristic polynomial of the feedback system of Fig. 3 consisting
of the plant (4) and the dynamic compensator

    D(d) u(t) = v(t) = -S_2(d) R_2^{-1}(d) e(t)                        (3.5-7)

Note that the transfer matrix of the compensator is D^{-1}(d) S_2(d) R_2^{-1}(d). Hence
it embodies the reference model (3). This result is in agreement with the so-called
Internal Model Principle [FW76] according to which, in order to possibly achieve
asymptotic tracking, the compensator has to incorporate the model of the reference
to be tracked. In the case of a constant reference, one should have (1 - d) at
the denominator of the compensator transfer function, so as to insure offset free
behaviour. This yields integral action as commonly employed in control system
design. In this connection, Cf. also Problem 2.4-11.
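The role of the factor (1 - d) can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes a scalar plant A(d) = 1 - a d, B(d) = b d with illustrative values a = 1.2, b = 1, a constant reference (D(d) = 1 - d), and the hypothetical choice R_2 = 1, S_2(d) = s_0 + s_1 d that makes the polynomial (6) equal to 1 (a deadbeat design). The function name and the zero initial conditions are assumptions made for the example only.

```python
import numpy as np

def simulate_integral_tracking(a=1.2, b=1.0, r=1.0, steps=30):
    """Simulate (1 - d) u(t) = -(s0 + s1 d) e(t) on the plant (1 - a d) y(t) = b d u(t)."""
    # Deadbeat choice: (1 - a d)(1 - d) + b d (s0 + s1 d) = 1
    s0 = (1.0 + a) / b
    s1 = -a / b
    y = 0.0        # assumed initial output y(0)
    u_prev = 0.0   # assumed u(-1)
    e_prev = 0.0   # assumed e(-1)
    errors = []
    for _ in range(steps):
        e = y - r                            # tracking error e(t) = y(t) - r(t)
        u = u_prev - s0 * e - s1 * e_prev    # compensator with integral action
        errors.append(e)
        y = a * y + b * u                    # plant update: y(t+1) = a y(t) + b u(t)
        u_prev, e_prev = u, e
    return np.array(errors)

errors = simulate_integral_tracking()
print(errors[:5])
```

With these values the open-loop plant is unstable, yet the error settles to zero after a couple of steps, because the compensator denominator contains the reference model 1 - d.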

Problem 3.5-2 Show that, if the feedback system of Fig. 1 is internally stable, the plant steady-
state output response to a constant input u equals P(1)u. Conclude then that, if the plant has no
unstable hidden modes, asymptotic tracking of an arbitrary constant reference vector is possible
if and only if rank P(1) = p. In turn, this implies that \dim u(t) \geq p. Then, conclude that the
condition \dim u(t) = \dim y(t) in (2) entails no limitation.



Problem 3.5-3 Show that, if the polynomial matrix D(d) in (3) is a divisor of A(d) in (2), 
one can write A(d)e(t) = B(d)u(t). Conclude that in such a case, since all modes of the reference 
belong also to the plant, any stabilizing 1-DOF compensator yields asymptotic tracking.

Theorem 3.5-1 (1-DOF Tracking). Consider a square plant, viz.

    \dim y(t) = \dim u(t),

free of unstable hidden modes and with a left coprime MFD A^{-1}(d)B(d). Let r(t)
be a bounded periodic reference modelled as in (3). Then, asymptotic tracking is
achieved by any stabilizing compensator of the form (7), making the closed-loop
system internally stable. Such compensators exist if and only if D(d) and B(d) are
coprime. Further, if D(d) is a divisor of A(d), any stabilizing 1-DOF compensator
yields asymptotic tracking.

The next problem shows that the output disturbance rejection problem is isomorphic
to the output reference tracking problem.

Problem 3.5-4 Consider the square plant y(t) = A^{-1}(d)B(d)u(t) + n(t) with A(d) and B(d)
left coprime, and n(t) an output disturbance such that D(d)n(t) = O_p, with D(d) as in (3). Show
that if the feedback compensator

    v(t) := D(d) u(t) = -S_2(d) R_2^{-1}(d) y(t)

is such that (6) is strictly Hurwitz, then it yields asymptotic disturbance rejection, viz.

    y(t) \to O_p   and   v(t) \to O_p      (t \to \infty)

whenever D(d) and B(d) are coprime.

Main points of the section The asymptotic tracking problem can be formulated 
as an internal stability problem of a feedback system. Under general conditions, 
asymptotic tracking can be achieved by 1-DOF controllers embodying the modes 
of the reference model. 

Problem 3.5-5 (Joint Tracking and Disturbance Rejection) Consider the plant of Problem 4.
Assume that the disturbed output y(t) has to follow a reference r(t) such that G(d)r(t) = O_p with
G(d) a polynomial matrix with the same properties as D(d) in (3). Let L(d) be the least common
multiple of G(d) and D(d). Determine the conditions under which the feedback compensator

    v(t) := L(d) u(t) = -S_2(d) R_2^{-1}(d) e(t)

yields both asymptotic tracking and disturbance rejection.

Notes and References 

Appendix B provides a quick review of the results of polynomial matrix theory used
throughout the chapter. Our notations and definitions mainly follow [Kuc79].

The formulation of the feedback stability problem first appeared in
[DC75]. It has been widely used since, e.g. [Vid85]. [DC75] also includes examples
showing that any three of the blocks of H_{zw}(d) can be stable while the fourth is
unstable.

The parametric form of all stabilizing controllers appeared in [YJB76] for the 
continuous-time case, and in [Kuc75] and [Kuc79] for the discrete-time case. A 
subsequent version of the parameterization using factorizations in terms of stable 
transfer matrices was first introduced by [DLMS80], and used since, e.g. [Vid85]. 

The robust stability result of Sect. 3.3 follows the adaptation of the approach 
in [LSA81] to the discrete-time case as reported in [BGW90]. The generic robust 
stability result for discrete-time feedback systems is reported in [CMS91]. A similar 
result for continuous-time state-feedback systems was presented in [CD89].

DETERMINISTIC LQ REGULATION - II
Solution via Polynomial Equations

In this chapter another approach for solving the steady-state LQOR problem is 
described. This, which will be referred to as the polynomial equation approach 
[Kuc79] , can be regarded as an alternative to the one based on Dynamic Program- 
ming and leading to Riccati equations. 

The reader may wonder why we should get bogged down with an alternative 
approach once we have found that the one using Riccati equations leads to efficient 
numerical solution routines. The answer is manifold. First, from a conceptual 
viewpoint it is beneficial to appreciate that the Riccati-based solution is not the 
only way for solving the steady-state LQOR problem. As we shall see soon, this 
holds true even if the plant is given in a state-space representation as in (2.4-44). 
Second, the polynomial equation approach backs up the one based on Dynamic 
Programming, offering complementary insights. E.g., in the polynomial equation 
approach the eigenvalues of the closed-loop system show up in a direct way. This
is a consequence of the polynomial equation approach being basically
a frequency-domain methodology, in contrast with the time-domain nature of Dynamic
Programming. Third, the polynomial equation approach can be extended
to cover steady-state LQ stochastic control as well as filtering problems. This also 
yields additional insights to the ones offered by stochastic Dynamic Programming. 
E.g., as will be seen in due time, the polynomial solution to the LQ stochastic servo 
problem provides a nice clue on how to realize high performance two degrees-of- 
freedom servo-controllers. 

The polynomial equation approach to the steady-state deterministic LQOR 
problem will be discussed by disaggregating the required steps so as to emphasize 
their specific role. One reason is that, mutatis mutandis, the same steps can be 
followed to solve via polynomial equations other LQ optimization problems, such 
as linear minimum mean-square error filtering and steady-state LQ stochastic regulation.
The advantage of introducing the basic tools of the polynomial equation
approach to LQ optimization at this stage is twofold: first, we can nicely relate
them to those pertaining to Dynamic Programming; and, second, by presenting them
within the simplest possible framework, we maximize our understanding of their main
features.

The chapter is organized as follows. Sect. 1 shows that the polynomial approach
to the steady-state deterministic LQ regulation amounts to solving a spectral
factorization problem, and decomposing a two-sided sequence into two additive
sequences, one of which is causal and the other strictly anticausal and possibly of finite
energy. The latter problem is addressed in Sect. 2 and consists of finding the
solution to a bilateral Diophantine equation with a degree constraint. In Sect. 3 we show
that stability of the optimally regulated system requires solving a second bilateral
Diophantine equation along with the one referred to above. Sect. 4 proves solvability
of these two bilateral Diophantine equations under the stabilizability assumption
of the plant.

4.1 Polynomial Formulation 

Consider a time-invariant linear state-representation of the plant to be output
regulated

    x(t+1) = \Phi x(t) + G u(t)
    y(t)   = H x(t)                                                    (4.1-1)

The problem is to find, whenever it exists, an input sequence

    u(\cdot) = \{ u(0), u(1), \cdots \}                                (4.1-2)

minimizing the quadratic performance index

    J := J(x(0), u_{[0,\infty)})
       = \sum_{k=0}^{\infty} [ \|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u} ]
       = \sum_{k=0}^{\infty} [ \|x(k)\|^2_{\psi_x} + \|u(k)\|^2_{\psi_u} ]      (4.1-3)

for any initial state x(0). In (3), \psi_x := H' \psi_y H and

    \psi_y = \psi_y' > 0   and   \psi_u = \psi_u' \geq 0               (4.1-4)

As in Sect. 3.1, we use the d-representations u(d) and x(d) of the sequences u(\cdot)
and, respectively, x(\cdot). In particular, (3.1-26) gives

    x(d) = A^{-1}(d) [ x(0) + B(d) u(d) ]                              (4.1-5)

where A(d) and B(d) are the following polynomial matrices

    A(d) := I - d\Phi                                                  (4.1-6)
    B(d) := dG                                                         (4.1-7)

Exploiting (3.1-12), (3) can be rewritten as

    J = \langle y^*(d) \psi_y y(d) + u^*(d) \psi_u u(d) \rangle
      = \langle x^*(d) \psi_x x(d) + u^*(d) \psi_u u(d) \rangle        (4.1-8)

Let us introduce also a right coprime MFD B_2(d) A_2^{-1}(d) of the transfer matrix
H_{xu}(d)

    H_{xu}(d) = A^{-1}(d) B(d) = B_2(d) A_2^{-1}(d)                    (4.1-9)

Then, (5) can be rewritten as

    x(d) = A^{-1}(d) x(0) + B_2(d) A_2^{-1}(d) u(d)                    (4.1-10)

Substituting (10) into (8), we get

    J = \langle u^*(d) A_2^{-*}(d) [ A_2^*(d) \psi_u A_2(d) + B_2^*(d) \psi_x B_2(d) ] A_2^{-1}(d) u(d)
        + u^*(d) A_2^{-*}(d) B_2^*(d) \psi_x A^{-1}(d) x(0)
        + x'(0) A^{-*}(d) \psi_x B_2(d) A_2^{-1}(d) u(d)
        + x'(0) A^{-*}(d) \psi_x A^{-1}(d) x(0) \rangle                (4.1-11)

where we used the shorthand notation A_2^{-*}(d) := [A_2^{-1}(d)]^*. Eq. (11) can be
simplified by considering an m x m Hurwitz polynomial matrix E(d) solving the
following right spectral factorization problem

    E^*(d) E(d) = A_2^*(d) \psi_u A_2(d) + B_2^*(d) \psi_x B_2(d)      (4.1-12)

E(d) is then called a right spectral factor of the R.H.S. of (12). E(d) exists if and
only if

    rank \begin{bmatrix} \psi_u A_2(d) \\ \psi_x B_2(d) \end{bmatrix} = m := \dim u      (4.1-13)

The spectral factors are determined uniquely up to an orthogonal matrix multiple.
If E(d) and \Gamma(d) are two right spectral factors of the R.H.S. of (12), then

    \Gamma(d) = U E(d)                                                 (4.1-14)

where U is an orthogonal matrix, viz. U'U = I_m. In particular, if m = 1, the right
spectral factor E(d) is a Hurwitz polynomial and U represents just a change of
sign.

We see that the R.H.S. of (12), call it M(d, d^{-1}), is a polynomial matrix in d and
d^{-1} which, according to Sect. 3.1, can be considered as a two-sided matrix-sequence
of finite length. Further, it is symmetric about the time 0, viz. M^*(d, d^{-1}) =
M(d, d^{-1}). In the single input case, m = 1, it is a polynomial symmetric in d
and d^{-1}. Hence, if d = a is a root, d = 1/a is a root as well. Further, since its
coefficients are real, if d = a is a root, d = a^* is also a root, a^* denoting the complex
conjugate of a. It follows that M(d, d^{-1}) has an even number of inverse/Hermitian
symmetric complex roots. Each root on the unit circle must have an even multiplicity.
Therefore, E(d) can be constructed by collecting all the d-roots of M(d, d^{-1}) such
that |d| > 1, along with every root such that |d| = 1 with multiplicity equal to
one half the corresponding multiplicity pertaining to M(d, d^{-1}).
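The root-collection recipe above can be sketched numerically for the single-input case. The snippet below is a minimal illustration under stated assumptions: M(d, d^{-1}) is given by its real symmetric coefficient list m_{-n}, ..., m_0, ..., m_n, no root lies on the unit circle (so the factor is strictly Hurwitz), and the function name `scalar_spectral_factor` is hypothetical. The sample data -2d^{-1} + 5 - 2d are chosen for illustration.

```python
import numpy as np

def scalar_spectral_factor(m):
    """Return coefficients (ascending powers of d) of E(d) with E*(d)E(d) = M(d, 1/d).

    m lists the symmetric coefficients m_{-n}, ..., m_0, ..., m_n of M.
    Assumes real coefficients and no roots on the unit circle.
    """
    m = np.asarray(m, dtype=float)
    n = (len(m) - 1) // 2
    # Roots of the ordinary polynomial d^n M(d, 1/d); np.roots wants highest power first.
    roots = np.roots(m[::-1])
    outside = roots[np.abs(roots) > 1.0]
    if len(outside) != n:
        raise ValueError("roots on the unit circle are not handled in this sketch")
    # Build prod_i (1 - d / rho_i), stored in ascending powers of d.
    e = np.array([1.0])
    for rho in outside:
        e = np.convolve(e, [1.0, -1.0 / rho])
    e = np.real(e)
    # Scale so that the d^0 coefficient of E*(d)E(d) equals m_0.
    c = np.sqrt(m[n] / np.sum(e ** 2))
    return c * e

E_coeffs = scalar_spectral_factor([-2.0, 5.0, -2.0])
print(E_coeffs)
```

The positive sign of the scaling constant is an arbitrary choice; by (14) with m = 1, the factor with the opposite sign is equally valid.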



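Example 4.1-1 Let:

    \Phi = \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix} ;  G = \begin{bmatrix} 1 \\ 0 \end{bmatrix} ;  H = [ 1  1 ] ;  \psi_y = 1 ;  \psi_u = 2      (4.1-15)

We find

    A(d) = \begin{bmatrix} 1 - d & 0 \\ 0 & 1 - d/2 \end{bmatrix} ;  B(d) = \begin{bmatrix} d \\ 0 \end{bmatrix}      (4.1-16)

    H_{xu}(d) = A^{-1}(d) B(d) = \begin{bmatrix} d/(1-d) \\ 0 \end{bmatrix} = \begin{bmatrix} d \\ 0 \end{bmatrix} \frac{1}{1 - d}      (4.1-17)

Hence,

    A_2(d) = 1 - d ;  B_2(d) = \begin{bmatrix} d \\ 0 \end{bmatrix}      (4.1-18)

For the R.H.S. of (12) we find

    2(1 - d^{-1})(1 - d) + 1 = -2d^{-1} (1 - (5/2)d + d^2) = -2d^{-1} (d - 1/2)(d - 2) = 4 (1 - d^{-1}/2)(1 - d/2)

Hence,

    E(d) = \pm 2 (1 - d/2)                                             (4.1-19)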

Using (12) in (11), we obtain

    J = J_1 + J_2                                                      (4.1-20)

with

    J_1 := \langle \ell^*(d) \ell(d) \rangle                           (4.1-21)

    \ell(d) := E^{-*}(d) B_2^*(d) \psi_x A^{-1}(d) x(0) + E(d) A_2^{-1}(d) u(d)      (4.1-22)

    J_2 := \langle x'(0) A^{-*}(d) [ \psi_x - \psi_x B_2(d) E^{-1}(d) E^{-*}(d) B_2^*(d) \psi_x ] A^{-1}(d) x(0) \rangle      (4.1-23)


Note that J_2 is not affected by u(d). Then, the problem amounts to finding causal
input sequences u(d) minimizing (21). According to (3.1-12), (21) equals the square
of the \ell_2-norm of the m-vector sequence \ell(d) in (22). In turn, (22) has two additive
components. One, E(d) A_2^{-1}(d) u(d), is the d-representation of a causal sequence.
The other results from premultiplying the d-representation of the causal sequence
\psi_x A^{-1}(d) x(0) by E^{-*}(d) B_2^*(d) = [B_2(d) E^{-1}(d)]^*. This, considering (3.1-33) and
(3.1-34), can be interpreted as the d-representation of a strictly anticausal sequence,
since E(0) is nonsingular by Hurwitzianity of E(d). By (3.1-8), the first term on the
R.H.S. of (22) can be thus interpreted as the d-representation of a sequence obtained
by convolving a causal sequence with a strictly anticausal sequence. Hence,
the first additive term on the R.H.S. of (22) is a two-sided m-vector sequence.

Taking into account the above interpretation in order to find the optimal causal
input sequences u(d), we try to additively decompose \ell(d) in terms of a causal
sequence \ell_+(d) plus a strictly anticausal sequence \ell_-(d):



    \ell(d) = \ell_+(d) + \ell_-(d)                                    (4.1-24)

    ord \ell_+(d) \geq 0 ,   ord \ell_-^*(d) > 0                       (4.1-25)

Since from (25),

    ord [ \ell_-^*(d) \ell_+(d) ] > 0                                  (4.1-26)

it follows that

    \langle \ell_-^*(d) \ell_+(d) \rangle = \langle \ell_+^*(d) \ell_-(d) \rangle = 0      (4.1-27)

Consequently, with the decomposition (24) we would have

    J_1 = \langle \ell_+^*(d) \ell_+(d) \rangle + \langle \ell_-^*(d) \ell_-(d) \rangle      (4.1-28)

Since the two additive terms on the R.H.S. of (28) are nonnegative, boundedness
of J_1 requires that each of them be such. Therefore, we must possibly insure that
in (24) \ell_-(d) be an \ell_2-sequence. Further, boundedness of \langle \ell_+^*(d) \ell_+(d) \rangle will follow
by restricting u(d) to be such as to make \ell_+(d) an \ell_2-sequence.
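The orthogonality property (27) and the energy split (28) can be checked on a finite two-sided scalar sequence. The sketch below is illustrative only: the sequence is random, the time window -N..N is an arbitrary assumption, and the bracket is evaluated as the coefficient of d^0, i.e. as a plain sum of products over time.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
times = np.arange(-N, N + 1)               # t = -N, ..., N
ell = rng.standard_normal(times.size)      # finite two-sided scalar sequence

ell_plus = np.where(times >= 0, ell, 0.0)  # causal part (t >= 0)
ell_minus = np.where(times < 0, ell, 0.0)  # strictly anticausal part (t < 0)

# Cross term of (27): the supports do not overlap, so the d^0 coefficient is zero.
cross = float(np.dot(ell_minus, ell_plus))

# (28): the energy of ell is the sum of the energies of its two parts.
J1 = float(np.dot(ell, ell))
J1_split = float(np.dot(ell_plus, ell_plus) + np.dot(ell_minus, ell_minus))
print(cross, J1, J1_split)
```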

Main points of the section The polynomial equation approach to the steady-
state deterministic LQOR problem amounts to finding a right spectral factor E(d)
as in (12), and decomposing the two-sided sequence \ell(d) in (22) into two additive
sequences, one of which is causal and the other strictly anticausal and possibly of finite
energy.

Problem 4.1-1 Consider (12) for a SISO plant (1) with transfer function H_{yu}(d) = HB_2(d)/A_2(d).
Assume that \psi_u > 0, \psi_y > 0 and the polynomials HB_2(d) and A_2(d) have no common root on
the unit circle. Show that the (two) spectral factors E(d) solving (12) are strictly Hurwitz
polynomials. [Hint: It is enough to check that E(d) has no root for |d| = 1. Note also that
A^*(e^{j\theta}) A(e^{j\theta}) = |A(e^{j\theta})|^2.]



4.2 Causal-Anticausal Decomposition 

It is convenient to transform the right spectral factorization (12) into an equation
involving solely polynomial matrices in the indeterminate d.

Let

    q := \max \{ \partial A_2(d), \partial B_2(d) \}                   (4.2-1)

    \bar{A}_2(d) := d^q A_2^*(d) ;  \bar{B}_2(d) := d^q B_2^*(d) ;  \bar{E}(d) := d^q E^*(d)      (4.2-2)

Then, (1-12) can be rewritten as

    \bar{E}(d) E(d) = \bar{A}_2(d) \psi_u A_2(d) + \bar{B}_2(d) \psi_x B_2(d)      (4.2-3)

Likewise, the first additive term on the R.H.S. of (1-22) can be rewritten as

    \tilde{\ell}(d) := \bar{E}^{-1}(d) \bar{B}_2(d) \psi_x A^{-1}(d) x(0)      (4.2-4)

Suppose now that we can find a pair of polynomial matrices Y and Z fulfilling the
following bilateral Diophantine equation

    \bar{E}(d) Y(d) + Z(d) A(d) = \bar{B}_2(d) \psi_x                  (4.2-5)

with the degree constraint

    \partial Z(d) < \partial \bar{E}(d) = q                            (4.2-6)

The last equality follows from the fact that, E(d) being Hurwitz, E(0) is nonsingular.
Using (5) in (4), we find

    \tilde{\ell}(d) = \tilde{\ell}_+(d) + \tilde{\ell}_-(d)            (4.2-7)

where

    \tilde{\ell}_+(d) := Y(d) A^{-1}(d) x(0)                           (4.2-8)

    \tilde{\ell}_-(d) := \bar{E}^{-1}(d) Z(d) x(0)                     (4.2-9)

are, respectively, a causal and a strictly anticausal sequence, the latter possibly
with finite energy. While causality of the first follows from causality of A^{-1}(d),
strict anticausality and possibly finite energy of \tilde{\ell}_-(d) is proved by showing that

    \tilde{\ell}_-^*(d) = x'(0) Z^*(d) \bar{E}^{-*}(d)                 (4.2-10)
        = x'(0) [ Z_0' + Z_1' d^{-1} + \cdots + Z'_{\partial Z} d^{-\partial Z} ] \{ [d^q E^*(d)]^* \}^{-1}
        = x'(0) [ Z_0' + Z_1' d^{-1} + \cdots + Z'_{\partial Z} d^{-\partial Z} ] d^q E^{-1}(d)
        = x'(0) [ Z'_{\partial Z} d^{q - \partial Z} + \cdots + Z_0' d^q ] E^{-1}(d)






where we set

    Z(d) = Z_0 + Z_1 d + \cdots + Z_{\partial Z} d^{\partial Z}        (4.2-11)

From (10), we see that strict causality and possibly finite energy of \tilde{\ell}_-^*(d) follow
from (6) and Hurwitzianity of E(d). Should E(d) be strictly Hurwitz, finite energy
of \tilde{\ell}_-(d) would follow at once.

In conclusion, provided that we can find a pair (Y(d), Z(d)) solving (5) with
the degree constraint (6), a decomposition (1-24) is given by

    \ell_+(d) = Y(d) A^{-1}(d) x(0) + E(d) A_2^{-1}(d) u(d)            (4.2-12)

    \ell_-(d) = \bar{E}^{-1}(d) Z(d) x(0)                              (4.2-13)

Let

    J_3 := \langle \ell_-^*(d) \ell_-(d) \rangle                       (4.2-14)

With J_2 as in (1-23), assume that J_2 + J_3 is bounded. Then, an optimal input
sequence is obtained by setting \ell_+(d) = O_m, i.e.

    u(d) = -A_2(d) E^{-1}(d) Y(d) A^{-1}(d) x(0)                       (4.2-15)
         = -A_2(d) E^{-1}(d) Y(d) [ x(d) - A^{-1}(d) B(d) u(d) ]       [(1-5)]

Equivalently,

    u(d) = -M_1^{-1}(d) N_1(d) x(d)                                    (4.2-16)

where, for reasons that will become clearer in the next section, we have introduced
the following transfer matrices

    M_1(d) := E^{-1}(d) [ E(d) - Y(d) B_2(d) ] A_2^{-1}(d)             (4.2-17)
    N_1(d) := E^{-1}(d) Y(d)                                           (4.2-18)

The following lemma sums up the results obtained so far.

Lemma 4.2-1. Provided that:

i. Condition (1-13) is satisfied;

ii. Eq. (5) admits solutions (Y(d), Z(d)) with the degree constraint (6);

iii. J_2 + J_3 is bounded;

the LQOR problem has either open-loop solutions (15) or linear state-feedback
solutions (16)-(18).

Main points of the section The bilateral Diophantine equation (5) along with 
the degree constraint (6), if solvable, allows one to obtain a causal/strictly anti- 
causal decomposition of (4) and, hence, an optimal control sequence. 
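The reversal operation (2) and the polynomial identity (3) can be checked numerically. The sketch below assumes single-input data consistent with Example 1-1 as far as (1-18) and (1-19) determine them: A_2(d) = 1 - d, B_2(d) = [d 0]', E(d) = 2 - d, with psi_x = H'H for H = [1 1] and psi_u = 2. Polynomials are stored as coefficient arrays in ascending powers of d, and the helper name `bar` is hypothetical.

```python
import numpy as np

def bar(coeffs, q):
    """Coefficients (ascending powers of d) of d^q p*(d), for a real scalar p(d) with degree <= q."""
    padded = np.zeros(q + 1)
    padded[:len(coeffs)] = coeffs
    return padded[::-1]          # coefficient of d^j is p_{q-j}

q = 1
A2 = np.array([1.0, -1.0])                           # 1 - d
B2 = [np.array([0.0, 1.0]), np.array([0.0, 0.0])]    # entries of B_2(d) = [d, 0]'
E = np.array([2.0, -1.0])                            # 2 - d
psi_u = 2.0
psi_x = np.array([[1.0, 1.0], [1.0, 1.0]])           # H'H with H = [1 1]

A2_bar = bar(A2, q)                                  # -1 + d
B2_bar = [bar(b, q) for b in B2]                     # [1, 0]
E_bar = bar(E, q)                                    # -1 + 2d

# Left- and right-hand sides of (3), as polynomials of degree 2q.
lhs = np.convolve(E_bar, E)
rhs = psi_u * np.convolve(A2_bar, A2)
for i in range(2):
    for j in range(2):
        rhs = rhs + psi_x[i, j] * np.convolve(B2_bar[i], B2[j])
print(lhs, rhs)
```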

4.3 Stability 

From Chapter 2 we already know that LQR optimality does not imply in general
stability of the optimally regulated closed loop system. In Sect. 2 we have
found that optimal LQOR laws can be obtained by solving a bilateral Diophantine
equation. Even if solvable, the latter need not have a unique solution. By imposing
stability of the closed-loop system, we obtain, under fairly general conditions,
uniqueness of the solution. To do this, we resort to Theorem 3.2-1. First, for M_1(d)
and N_1(d) as in (2-17) and, respectively, (2-18),

    M_1(d) A_2(d) + N_1(d) B_2(d)
        = E^{-1}(d) [ E(d) - Y(d) B_2(d) ] + E^{-1}(d) Y(d) B_2(d)
        = I_m                                                          (4.3-1)

Hence, (3.2-24) is satisfied. Then, internal stability is obtained if and only if both
M_1(d) and N_1(d) are stable transfer matrices. We begin by finding a necessary
condition for stability of M_1(d). To this end, we write

    M_1(d) = E^{-1}(d) \bar{E}^{-1}(d) [ \bar{E}(d) E(d) - \bar{E}(d) Y(d) B_2(d) ] A_2^{-1}(d)
           = E^{-1}(d) \bar{E}^{-1}(d) \{ \bar{A}_2(d) \psi_u A_2(d) + \bar{B}_2(d) \psi_x B_2(d)
                 - [ \bar{B}_2(d) \psi_x - Z(d) A(d) ] B_2(d) \} A_2^{-1}(d)
           = E^{-1}(d) \bar{E}^{-1}(d) [ \bar{A}_2(d) \psi_u + Z(d) A(d) B_2(d) A_2^{-1}(d) ]
           = E^{-1}(d) \bar{E}^{-1}(d) [ \bar{A}_2(d) \psi_u + Z(d) B(d) ]      [(1-9)]      (4.3-2)

where the second equality follows from (1-12) and (2-5). We note that, E(d) being
Hurwitz, \bar{E}(d) turns out to be anti-Hurwitz, viz.

    \det \bar{E}(d) = 0  \Rightarrow  |d| \leq 1                       (4.3-3)

Then, a necessary condition for stability of M_1(d) is that the polynomial matrix
within brackets in (2) be divided on the left by \bar{E}(d). I.e., there must be a
polynomial matrix X(d) such as to satisfy the following equation

    \bar{E}(d) X(d) - Z(d) B(d) = \bar{A}_2(d) \psi_u                  (4.3-4)

Recalling (2-5), we conclude that, in order to solve the steady-state LQOR problem,
in addition to the spectral factorization problem (1-12), we have to find a solution
(X(d), Y(d), Z(d)) with \partial Z(d) < \partial \bar{E}(d) of the two bilateral Diophantine equations
(2-5) and (4).

Using (4) in (2) we find

    M_1(d) = E^{-1}(d) X(d)                                            (4.3-5)

This, along with (2-18), yields for (2-16)

    u(d) = -X^{-1}(d) Y(d) x(d)                                        (4.3-6)

We then see that the Z(d) in (2-5) and (4) plays the role of a "dummy" polynomial
matrix. By eliminating Z(d) in (2-5) and (4), we get

    X(d) A_2(d) + Y(d) B_2(d) = E(d)                                   (4.3-7)



Problem 4.3-1 Derive (7) from (2-5), (4) and (1-12), by eliminating the "dummy" polynomial
matrix Z(d).

Problem 4.3-2 Show that a triplet (X(d), Y(d), Z(d)) is a solution of (2-5) and (4) if and only
if it solves (2-5) and (7). [Hint: Prove sufficiency by using (2-3).]

It follows from (5) that X(d) is nonsingular. This can be seen also by setting
d = 0 in (7) to find X(0) A_2(0) = E(0). In fact, recall that, by (1-9) and (1-7),
B_2(0) = O_{n x m}. Since both A_2(0) and E(0) are nonsingular, nonsingularity of
X(0), and hence of X(d), follows.

It also follows that X(d) and Y(d) are constant matrices. This is proved in the
following lemma.

Lemma 4.3-1. Let (2-5) and (4) [or (2-5) and (7)] have a solution (X(d), Y(d),
Z(d)) with \partial Z < \partial \bar{E}. Then X(d) = X and Y(d) = Y are constant matrices, viz.
\partial X = \partial Y = 0.

Proof Consider (2-5). \bar{E}(d) is a regular polynomial matrix, viz. the coefficient matrix of its highest
power is nonsingular. Further, by (2-2), \partial \bar{E}(d) = q. Then, it follows that \partial[\bar{E}(d)Y(d)] = q + \partial Y(d).
Next, Z(d)A(d) = Z(d) - dZ(d)\Phi. Hence, \partial[Z(d)A(d)] \leq \partial \bar{E}(d) - 1 + 1 = q. Further, from (2-2)
\partial \bar{B}_2(d) \leq q - 1. Hence, \partial[\bar{B}_2(d)\psi_x] \leq q - 1. Therefore,

    q + \partial Y(d) = \partial [ \bar{E}(d) Y(d) ]
                      = \partial \{ \bar{B}_2(d) \psi_x - Z(d) A(d) \}
                      \leq \max \{ \partial[\bar{B}_2(d) \psi_x], \partial[Z(d) A(d)] \} \leq q

Hence, \partial Y(d) = 0.

Similarly, with reference to (4),

    q + \partial X(d) = \partial [ \bar{E}(d) X(d) ]
                      = \partial [ \bar{A}_2(d) \psi_u + Z(d) B(d) ]
                      \leq \max \{ \partial[\bar{A}_2(d) \psi_u], \partial[Z(d) B(d)] \} \leq q

Hence, \partial X(d) = 0.

From \partial X(d) = 0, it follows that X and Y are left coprime. Then, using Theorem
3.2-1, we find that the d-characteristic polynomial \chi_{cl}(d) of the closed-loop system
(1-9) and (6), with plant and regulator free of nonzero hidden eigenvalues, is given
by

    \chi_{cl}(d) = \det E(d) / \det E(0)                               (4.3-8)
    E(d) = X A_2(d) + Y B_2(d)                                         (4.3-9)

The following lemma sums up the above results.

Lemma 4.3-2. Provided that:

i. Condition (1-13) is satisfied;

ii. Eq. (2-5) and (4) [or (2-5) and (7)] admit a solution (X, Y, Z(d)) with \partial Z(d) <
\partial \bar{E}(d);

iii. J_2 + J_3 is bounded;

the LQOR problem is solved by the linear state-feedback law

    u(t) = -X^{-1} Y x(t)                                              (4.3-10)

where X and Y are the constant matrices in ii., and, correspondingly,

    J_{min} = J_2 + J_3

Further, the optimal feedback system is internally stable if and only if the right
spectral factor E(d) in (1-12) is strictly Hurwitz.

Main points of the section Stability of the optimally regulated system led us
to consider a second bilateral Diophantine equation to be jointly solved with the
one related to optimality. Internal stability is achieved if and only if the spectral
factorization problem (1-12) yields a strictly Hurwitz spectral factor.



4.4 Solvability 

It remains to establish conditions under which (2-5) and (3-4) are solvable.

Lemma 4.4-1. Let the greatest common left divisors of A(d) = I - d\Phi and B(d) =
dG be strictly Hurwitz. Then, there is a unique solution (X, Y, Z(d)) of (2-5) and
(3-4) [or (2-5) and (3-7)] such that \partial Z(d) < \partial \bar{E}(d). Such a solution is called the
minimum degree solution w.r.t. Z(d).

Proof Let D(d) be a greatest common left divisor (gcld) of A(d) and B(d). Then, according to
Appendix B, there exists a unimodular matrix P(d) such that

    [ A(d)  -B(d) ] P(d) = [ D(d)  O_{n x m} ]                         (4.4-1)

    P(d) = \begin{bmatrix} P_{11}(d) & P_{12}(d) \\ P_{21}(d) & P_{22}(d) \end{bmatrix}      (4.4-2)

Consider now (2-5) and (3-4). Rewrite them as follows

    \bar{E}(d) [ Y(d)  X(d) ] + Z(d) [ A(d)  -B(d) ] = [ \bar{B}_2(d) \psi_x   \bar{A}_2(d) \psi_u ]      (4.4-3)

Postmultiplying (3) by P(d), setting

    [ \tilde{Y}(d)  \tilde{X}(d) ] := [ Y(d)  X(d) ] P(d)              (4.4-4)

and using (1), we find

    \bar{E}(d) \tilde{Y}(d) + Z(d) D(d) = \bar{A}_2(d) \psi_u P_{21}(d) + \bar{B}_2(d) \psi_x P_{11}(d)      (4.4-5)

    \bar{E}(d) \tilde{X}(d) = \bar{A}_2(d) \psi_u P_{22}(d) + \bar{B}_2(d) \psi_x P_{12}(d)      (4.4-6)

Now, observe that

    O_{n x m} = -B(d) P_{22}(d) + A(d) P_{12}(d)                       (4.4-7)
              = -B(d) A_2(d) + A(d) B_2(d)

where the first equality follows from (1) and (2), and the second from (1-9). Since A_2(d) and
B_2(d) are right coprime, there is a polynomial matrix V(d) such that

    P_{12}(d) = B_2(d) V(d) ,   P_{22}(d) = A_2(d) V(d)                (4.4-8)

Then, using (2-3), the R.H.S. of (6) can be rewritten as \bar{E}(d) E(d) V(d). Hence, (6) reduces to

    \tilde{X}(d) = E(d) V(d)                                           (4.4-9)

Further, since D(d) is strictly Hurwitz and \bar{E}(d) is anti-Hurwitz, \det D(d) and \det \bar{E}(d) are
coprime. Then, it follows from Appendix C that (5) is solvable. If (\tilde{Y}_0(d), Z_0(d)) solves (5), all
solutions of (5) are given by

    \tilde{Y}(d) = \tilde{Y}_0(d) + L(d) D(d)                          (4.4-10)
    Z(d) = Z_0(d) - \bar{E}(d) L(d)                                    (4.4-11)

with L(d) any polynomial matrix of compatible dimensions. Since \bar{E}(d) is regular, the minimum
degree solution w.r.t. Z(d) can be found by left dividing Z_0(d) by \bar{E}(d), viz.,

    Z_0(d) = \bar{E}(d) Q(d) + R(d)                                    (4.4-12)

with \partial R(d) < \partial \bar{E}(d). Then, (11) becomes Z(d) = R(d) + \bar{E}(d)[Q(d) - L(d)]. Choosing L(d) = Q(d)
we obtain the minimum degree solution w.r.t. Z(d)

    \tilde{Y}(d) = \tilde{Y}_0(d) + Q(d) D(d)
    \tilde{X}(d) = E(d) V(d)                                           (4.4-13)
    Z(d) = R(d)

Hence

    [ Y(d)  X(d) ] P(d) = [ \tilde{Y}_0(d) + Q(d) D(d)   E(d) V(d) ]                        [(4)]
                        = [ \tilde{Y}_0(d)   E(d) V(d) ] + Q(d) [ D(d)  O_{n x m} ]
                        = [ \tilde{Y}_0(d)   E(d) V(d) ] + Q(d) [ A(d)  -B(d) ] P(d)       [(1)]

Hence

    Y(d) = Y_0(d) + Q(d) A(d)
    X(d) = X_0(d) - Q(d) B(d)                                          (4.4-14)
    Z(d) = R(d)

with

    [ Y_0(d)  X_0(d) ] := [ \tilde{Y}_0(d)   E(d) V(d) ] P^{-1}(d)     (4.4-15)

is the desired minimum degree solution w.r.t. Z(d) of (2-5) and (3-4).

As pointed out in Sect. 3.1, the fact that the gcld's of A(d) and B(d) are strictly
Hurwitz is equivalent to stabilizability of the pair (\Phi, G). Therefore, Lemma 1 is
the counterpart of Theorem 2.4-1 in the polynomial equation approach.
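Stabilizability of a pair (Phi, G) can be checked numerically with the PBH rank test, which is an equivalent state-space criterion and not the gcld computation used in the proof. The sketch below is illustrative: the function name `pbh_unreachable_modes` is hypothetical, and the sample data Phi = diag(1, 1/2), G = [1 0]' are an assumption chosen to give a stabilizable but not completely reachable pair.

```python
import numpy as np

def pbh_unreachable_modes(Phi, G, tol=1e-9):
    """Return the eigenvalues lam of Phi for which rank [lam*I - Phi, G] < n (PBH test)."""
    n = Phi.shape[0]
    modes = []
    for lam in np.linalg.eigvals(Phi):
        M = np.hstack([lam * np.eye(n) - Phi, G])
        if np.linalg.matrix_rank(M, tol=tol) < n:
            modes.append(lam)
    return modes

Phi = np.array([[1.0, 0.0], [0.0, 0.5]])
G = np.array([[1.0], [0.0]])

unreachable = pbh_unreachable_modes(Phi, G)
# The pair is stabilizable when every unreachable eigenvalue lies strictly inside the unit circle.
stabilizable = all(abs(lam) < 1.0 for lam in unreachable)
print(unreachable, stabilizable)
```

Here the eigenvalue 1/2 is unreachable but stable, so the pair is stabilizable; replacing the (2,2)-entry of Phi by 2 would make the test fail.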

Example 4.4-1 Consider again the LQOR problem of Example 1-1. We see that the pair (\Phi, G)
is stabilizable, though not completely reachable. Consequently, A(d) = I - d\Phi and B(d) = dG
have strictly Hurwitz gcld's. From (1-18) it follows that q = 1. Therefore, if we take (Cf. (1-19))
E(d) = 2(1 - d/2), we find

    \bar{A}_2(d) = d(1 - d^{-1}) = -1 + d
    \bar{B}_2(d) = d [ d^{-1}  0 ] = [ 1  0 ]                          (4.4-16)
    \bar{E}(d) = 2d(1 - d^{-1}/2) = -1 + 2d

We have to solve (2-5) and (3-4) with the degree constraint

    \partial Z(d) < \partial \bar{E}(d) = 1                            (4.4-17)

We find that the minimum degree solution w.r.t. Z(d) of (2-5) and (3-4) is, in agreement with
Lemma 1, unique and equals

    X = 2                                                              (4.4-18)
    Y = [ 1  1/3 ]                                                     (4.4-19)
    Z = [ 2  4/3 ]                                                     (4.4-20)

Therefore (3-10) becomes

    u(t) = -X^{-1} Y x(t) = -[ 1/2  1/6 ] x(t)

Further, the transition matrix of the corresponding closed-loop system equals

    \Phi_{cl} := \Phi - G X^{-1} Y = \begin{bmatrix} 1/2 & -1/6 \\ 0 & 1/2 \end{bmatrix}

Therefore

    \chi_{cl}(d) = \det(I - d\Phi_{cl}) = (1 - d/2)^2 = \frac{E(d)}{E(0)} (1 - d/2)      (4.4-21)

Hence, the closed-loop system is asymptotically stable. Further, the last equality shows that the
closed-loop eigenvalues are the reciprocal of the roots of the spectral factor E(d), together with
the unreachable eigenvalue of (\Phi, G).
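For a low-order case such as this one, the two bilateral Diophantine equations can be solved by matching coefficients of equal powers of d, which turns them into a linear system. The sketch below does this under explicit assumptions: Phi = diag(1, 1/2), G = [1 0]', H = [1 1], psi_y = 1, psi_u = 2, so that \bar{E}(d) = -1 + 2d, \bar{B}_2(d) psi_x = [1 1] and \bar{A}_2(d) psi_u = -2 + 2d; by Lemma 3-1 and the degree constraint, X, Y = [Y1 Y2] and Z = [Z1 Z2] are taken as constants. The layout of the unknown vector is a choice made for the example.

```python
import numpy as np

# Unknown vector v = [X, Y1, Y2, Z1, Z2] (all constants).
M = np.array([
    # (2-5), first column:  (-1 + 2d) Y1 + Z1 (1 - d) = 1
    [0.0, -1.0, 0.0, 1.0, 0.0],    # coefficient of d^0
    [0.0, 2.0, 0.0, -1.0, 0.0],    # coefficient of d^1
    # (2-5), second column: (-1 + 2d) Y2 + Z2 (1 - d/2) = 1
    [0.0, 0.0, -1.0, 0.0, 1.0],    # coefficient of d^0
    [0.0, 0.0, 2.0, 0.0, -0.5],    # coefficient of d^1
    # (3-4): (-1 + 2d) X - Z1 d = -2 + 2d
    [-1.0, 0.0, 0.0, 0.0, 0.0],    # coefficient of d^0
    [2.0, 0.0, 0.0, -1.0, 0.0],    # coefficient of d^1
])
rhs = np.array([1.0, 0.0, 1.0, 0.0, -2.0, 2.0])

# Six equations, five unknowns: least squares returns the exact solution when they are consistent.
v = np.linalg.lstsq(M, rhs, rcond=None)[0]
X, Y, Z = v[0], v[1:3], v[3:5]

K = Y / X                                    # state-feedback gain X^{-1} Y
Phi = np.array([[1.0, 0.0], [0.0, 0.5]])
G = np.array([[1.0], [0.0]])
Phi_cl = Phi - G @ K.reshape(1, 2)           # closed-loop transition matrix
eigs = np.linalg.eigvals(Phi_cl)
print(X, Y, Z, K, eigs)
```

Coefficient matching scales poorly with the degrees involved; it is used here only because every unknown is a constant.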
Problem 4.4-1 Consider again the LQOR problem of Example 1-1 except for the matrix \Phi
whose (2,2)-entry is now 2 instead of 1/2. Consequently the pair (\Phi, G) is not stabilizable. Show
that:

i. A_2(d), B_2(d) and E(d) are again as in (1-18) and (1-19);

ii. Eq. (2-5) and (3-4) have no solution.

It is of interest to establish conditions under which the constant pair (X, Y)
can be computed by using (3-9) alone. Whenever possible, this would provide
the matrices that are needed to construct the state-feedback gain-matrix -X^{-1}Y,
without the extra effort required to compute the "dummy" polynomial matrix Z(d).

Lemma 4.4-2. Let the pair (\Phi, G) be completely reachable. Then, (3-9) admits
a unique constant solution (X, Y) which is the same as the one yielded by the
minimum degree solution w.r.t. Z(d) of (2-5) and (3-4) [or (2-5) and (3-9)].

Proof First, we show that, under the stated condition, (3-9) has a unique constant solution
(X, Y). In fact, Lemma 1 guarantees that a constant solution of (3-9) exists. To see that it is
unique, consider that any constant pair (X, Y) solves (3-9) if and only if it solves

    \bar{A}_2(d) X' + \bar{B}_2(d) Y' = \bar{E}(d)                     (4.4-22)

Further,

    \bar{A}_2^{-1}(d) \bar{B}_2(d) = \hat{B}(d) \hat{A}^{-1}(d)        (4.4-23)

with \hat{A}(d) := dA^*(d) = dI - \Phi' and \hat{B}(d) := dB^*(d) = G'. Since (\Phi, G) is completely reachable,
it follows from the PBH reachability test that \hat{A}(d) and \hat{B}(d) are right coprime. In addition, \hat{A}(d) is
regular. It then follows that (22) has a unique solution with \partial Y' < \partial \hat{A}(d) = 1.

Problem 4.4-2 Show that if in (22) Y' is a constant matrix, X' is also a constant matrix.
[Hint: Recall that \partial B_2(d) \leq \partial A_2(d) = \partial E(d) = q and A_2(0) and E(0) are nonsingular.]

Problem 4.4-3 Consider the LQOR problem of Example 1-1 where (\Phi, G) is not completely
reachable. Show that (3-9) does not have a unique constant solution.

Problem 4.4-4 Consider the LQOR of Example 1-1. Check whether it is possible to use (3-9)
only to find the constant matrices X and Y.



Theorem 4.4-1. Let (\Phi, G) be a stabilizable pair, or, equivalently, A(d) := I - d\Phi
and B(d) := dG have strictly Hurwitz gcld's. Let (1-13) be fulfilled. Let (X, Y, Z(d))
be the minimum degree solution w.r.t. Z(d) of the bilateral Diophantine equations
(2-5) and (3-4) [or (2-5) and (3-7)]. Then, the constant state-feedback control

    u(t) = -X^{-1} Y x(t)                                              (4.4-24)

makes the closed-loop system internally stable if and only if the spectral factor E(d)
in (1-12) is strictly Hurwitz. In such a case, (24) yields the steady-state LQOR
law. If (\Phi, G) is controllable, the d-characteristic polynomial \chi_{cl}(d) of the optimally
regulated system is given by

    \chi_{cl}(d) = \frac{\det E(d)}{\det E(0)}                         (4.4-25)

Finally, if (\Phi, G) is a reachable pair, the matrix pair (X, Y) in (24) is the constant
solution of the unilateral Diophantine equation (3-7).

It is to be pointed out that Theorem 1 holds true for every \psi_u = \psi_u' \geq 0.
If \psi_u is positive definite, E(d) turns out to be strictly Hurwitz and the involved
polynomial equations solvable if (\Phi, G, H) is stabilizable and detectable [CGMN91].
This result agrees with the conclusions of Theorem 2.4-4 obtained via the Riccati
equation approach.

Main points of the section Solvability of the two linear Diophantine equations
relevant to the LQOR problem is guaranteed by stabilizability of the pair (\Phi, G).
Strict Hurwitzianity of the right spectral factor E(d) yields internal stability of
the optimally regulated closed-loop system. Whenever (\Phi, G) is a reachable pair,
the steady-state LQOR feedback-gain can be computed via a single Diophantine
equation.

Problem 4.4-5 (Stabilizing Cheap Control) Consider the polynomial solution of the steady-state
LQOR problem when the control variable is not costed, viz. \psi_u = O_{m x m}. The resulting regulation
law will be referred to as Stabilizing Cheap Control since the polynomial solution insures closed-
loop asymptotic stability if no unstable hidden modes are present and E(d) is strictly Hurwitz.
Find the Stabilizing Cheap Control for the plant of both Problems 2.6-5 and 2.6-6. Finally,
draw general conclusions on the location of the eigenvalues of SISO plants, either minimum or
nonminimum-phase, regulated by Stabilizing Cheap Control. Contrast Stabilizing Cheap Control
with Cheap Control.

4.5 Relationship with the Riccati-Based Solution 

In order to find a direct relationship between the polynomial and the Riccati-based
solutions we proceed as follows. Assume that $(\Phi, G, H)$ is stabilizable and
detectable. Let P be the symmetric nonnegative definite solution of the ARE
(2.4-57). Then, for every $x(0) \in \mathbb{R}^n$, we have

\[ x'(0)Px(0) = J_2 + J_3 \]

or

\[ P = \Big\langle A^{-*}(d)\big[\psi_x - \psi_x B_2(d)E^{-1}(d)E^{-*}(d)B_2^*(d)\psi_x +
A^*(d)Z^*(d)E^{-*}(d)E^{-1}(d)Z(d)A(d)\big]A^{-1}(d)\Big\rangle \]

Letting

\begin{align*}
E^{-*}(d)B_2^*(d)\psi_x &= \bar E^{-1}(d)\bar B_2(d)\psi_x \\
&= Y + \bar E^{-1}(d)Z(d)A(d) \qquad [(2\text{-}5)]
\end{align*}

we get

\begin{align}
P &= \big\langle A^{-*}(d)(\psi_x - Y'Y)A^{-1}(d) - d^{-q}A^{-*}(d)Y'E^{-*}(d)Z(d) -
d^{q}Z^*(d)E^{-1}(d)YA^{-1}(d)\big\rangle \nonumber \\
&= \big\langle A^{-*}(d)(\psi_x - Y'Y)A^{-1}(d)\big\rangle \nonumber \\
&= \Big\langle \sum_{r,k=0}^{\infty} d^{-r}d^{k}\,(\Phi')^{r}(\psi_x - Y'Y)\Phi^{k}\Big\rangle \nonumber \\
&= \sum_{k=0}^{\infty} (\Phi')^{k}(\psi_x - Y'Y)\Phi^{k} \nonumber \\
&= \Phi'P\Phi - Y'Y + \psi_x \tag{4.5-1}
\end{align}

In (1) the second equality follows since $d^{-q}E^{-*}(d)Z(d) = \bar E^{-1}(d)Z(d)$ is strictly
anticausal and $A^{-*}(d)Y'$ is anticausal; the third recalling (3.1-17); the fifth by a
formal identity (Cf. also Problem 2.4-1). Comparing (1) with the ARE (2.4-57),
we find

\[ Y'Y = \Phi'PG(\psi_u + G'PG)^{-1}G'P\Phi \tag{4.5-2} \]

Further, comparing (2.4-60) with (4-24), we find

\[ (\psi_u + G'PG)^{-1}G'P\Phi = X^{-1}Y \tag{4.5-3} \]

Write

\begin{align*}
Y'Y &= Y'X'^{-1}X'XX^{-1}Y \\
&= (X^{-1}Y)'\,X'X\,X^{-1}Y
\end{align*}

Taking into account (2) and (3), we get

\[ X'X = \psi_u + G'PG \tag{4.5-4} \]

From (4) and (3), it follows that

\[ X'Y = G'P\Phi \tag{4.5-5} \]

Finally,

\[ Z(d) = \bar B_2(d)P \tag{4.5-6} \]

To establish (6) we can proceed as follows. Rewrite (3-7) as

\[ \bar E(d) = \bar A_2(d)X' + \bar B_2(d)Y' \]

Using this in (2-5), we get

\begin{align*}
Z(d)A(d) &= \bar B_2(d)(\psi_x - Y'Y) - \bar A_2(d)X'Y \\
&= \bar B_2(d)(P - \Phi'P\Phi) - \bar A_2(d)G'P\Phi \qquad [(1)\ \&\ (5)]
\end{align*}

Recalling that $A(d) = I - d\Phi$, we have

\begin{align}
Z(d) - \bar B_2(d)P &= \{dZ(d) - [\bar B_2(d)\Phi' + \bar A_2(d)G']P\}\Phi \nonumber \\
&= d[Z(d) - \bar B_2(d)P]\Phi \tag{4.5-7}
\end{align}

The last equality follows since

\begin{align*}
\bar B_2(d)\Phi' + \bar A_2(d)G' &= [\Phi B_2(d) + GA_2(d)]^* d^q \\
&= [d^{-1}B_2(d)]^* d^q \qquad [A(d)B_2(d) = dGA_2(d)] \\
&= d\bar B_2(d)
\end{align*}

Eq. (7) can be rewritten as follows

\[ [Z(d) - \bar B_2(d)P]A(d) = O_{m \times n} \]

This yields (6).
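The relations (2), (4) and (5) lend themselves to a quick numerical check. The following is only a sketch under stated assumptions: the plant matrices are hypothetical, the ARE is solved by plainly iterating the RDE from zero, and X is taken as the upper-triangular Cholesky factor of $\psi_u + G'PG$, which is one admissible choice consistent with (4) rather than the factor produced by the polynomial procedure.

```python
import numpy as np

def solve_are(Phi, G, psi_x, psi_u, iters=2000):
    """Iterate the forward RDE from P(0) = 0 (a plain fixed-point sketch)."""
    P = np.zeros_like(psi_x)
    for _ in range(iters):
        S = psi_u + G.T @ P @ G
        P = Phi.T @ P @ Phi - Phi.T @ P @ G @ np.linalg.solve(S, G.T @ P @ Phi) + psi_x
    return P

# Hypothetical plant (an assumption): reachable, observable, one unstable mode.
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
psi_x = H.T @ H
psi_u = np.array([[1.0]])

P = solve_are(Phi, G, psi_x, psi_u)
S = psi_u + G.T @ P @ G
X = np.linalg.cholesky(S).T                 # X'X = psi_u + G'PG, cf. (4.5-4)
Y = np.linalg.solve(X.T, G.T @ P @ Phi)     # X'Y = G'P Phi,      cf. (4.5-5)
K = np.linalg.solve(X, Y)                   # X^{-1}Y,            cf. (4.5-3)

# Residuals of (4.5-2) and of the last line of (4.5-1).
res_2 = Y.T @ Y - Phi.T @ P @ G @ np.linalg.solve(S, G.T @ P @ Phi)
res_1 = P - (Phi.T @ P @ Phi - Y.T @ Y + psi_x)
rho = np.abs(np.linalg.eigvals(Phi - G @ K)).max()   # closed-loop spectral radius
```

Both residuals should vanish to rounding error, and `rho` should be below one, since $X^{-1}Y$ is the steady-state LQ gain.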

Main points of the section Eqs. (1)-(2), and (4)-(6) give the relationship
between the polynomial and the Riccati-based solutions of the steady-state LQOR
problem.

Figure 4.6-1: Plant/compensator cascade unity feedback for an LQ regulated
system. 



4.6 Robust Stability of LQ Regulated Systems 

We shall use the results of Sect. 3.3 to analyze robust stability properties of op- 
timally LQ output regulated systems. In this respect, the first comment is that, 
in view of Theorem 3.3-1, stability robustness is guaranteed from the outset by 
the state-feedback nature of the LQOR solution, and asymptotic stability of the 
nominal closed-loop system. Nevertheless, we intend to use also Fact 3.3-1 so as 
to point out the connection in LQ regulated systems between robust stability and 
the so-called Return Difference Equality.

From the spectral factorization (1-12) and (3-9) we have

\begin{multline*}
\psi_u + A_2^{-*}(d)B_2^*(d)\psi_x B_2(d)A_2^{-1}(d) = \\
[I_m + A_2^{-*}(d)B_2^*(d)Y'X'^{-1}]\,X'X\,[I_m + X^{-1}YB_2(d)A_2^{-1}(d)]
\end{multline*}

Recalling (1-9) and (4-32), the above equation can be rewritten as follows

\begin{multline}
\psi_u + P_{xu}^*(d)\psi_x P_{xu}(d) = \\
[I_m + K_{LQ}P_{xu}(d)]^*(\psi_u + G'PG)[I_m + K_{LQ}P_{xu}(d)] \tag{4.6-1}
\end{multline}

where

\[ K_{LQ} := X^{-1}Y \tag{4.6-2} \]

denotes the constant transfer matrix (state-feedback gain matrix) of the LQOR
in the plant/compensator cascade unity feedback system as in Fig. 1. Eq. (1) is
known as the Return Difference Equality of the LQ regulated system. Similarly to
(3.3-4), we denote the loop transfer matrix of the plant/compensator cascade of
Fig. 1 by

\[ \mathcal{G}_{LQ}(d) := K_{LQ}P_{xu}(d) \tag{4.6-3} \]

and the corresponding return difference by $I_m + \mathcal{G}_{LQ}(d)$. In the light of Sect. 3.3,
we interpret $\mathcal{G}_{LQ}(d)$ as the nominal loop transfer matrix. In fact, we suppose
that $K_{LQ}$ has been designed based upon the nominal plant transfer matrix $P_{xu}(d)$.
Similarly to (3.3-6), we assume that the actual plant transfer matrix equals $P_{xu}(d)$
postmultiplied by a multiplicative perturbation matrix M(d). Consequently, the
actual loop transfer matrix $\mathcal{G}(d)$ equals

\[ \mathcal{G}(d) = \mathcal{G}_{LQ}(d)M(d) \tag{4.6-4} \]

In order to use here the robust stability result in Fact 3.3-1, we exploit (1) so as
to find a lower bound for the R.H.S. of (3.3-10). We note that for d taking values
on the unit circle of the complex plane, $H^*(d)$ is the Hermitian of H(d), whenever
the latter is a rational matrix with real coefficients. Consequently, for $d \in \mathbb{C}$ and
$|d| = 1$, $H^*(d)H(d)$ is Hermitian symmetric nonnegative definite. The other point
that we shall use to lower bound the R.H.S. of (3.3-10) via (1), is that Theorem
2.4-4 ensures that the matrix P in (1) is a bounded symmetric nonnegative definite
matrix, provided that $(\Phi, G, H)$ is stabilizable and detectable. Hence, under
such conditions there exists a positive real $\beta^2$ such that

\[ \psi_u \le \psi_u + G'PG \le \beta^2 I_m \tag{4.6-5} \]

Consequently, remembering the definition of the contour $\Omega$ in $\mathbb{C}$ given after (3.3-8),
we find from (1), (2) and (3) for $d \in \Omega$

\begin{align}
\beta^2[I_m + \mathcal{G}_{LQ}(d)]^*[I_m + \mathcal{G}_{LQ}(d)] &\ge
\psi_u + P_{xu}^*(d)\psi_x P_{xu}(d) \nonumber \\
&\ge \psi_u \tag{4.6-6}
\end{align}

Hence

\[ \underline{\sigma}(I_m + \mathcal{G}_{LQ}(d)) \ge \frac{\lambda_{\min}^{1/2}(\psi_u)}{\beta} =: \alpha \le 1 \tag{4.6-7} \]

Proposition 4.6-1. Consider the steady-state LQ output regulated system of Fig. 1,
where $(\Phi, G, H)$ is stabilizable and detectable and $\psi_u > 0$. Then, there exists a
positive real $\alpha \le 1$ lower bounding the minimum singular value of the return difference
matrix as in (7).

We note that the bound $\alpha$ in (7) is in general quite conservative. In [Sha86] a
sharper bound is given. Our interest in (7) is that it shows that if an LQ regulator
is designed on the basis of a nominal pair $(\Phi, G)$, then the corresponding nominal
return difference $I_m + \mathcal{G}_{LQ}(d)$ has its smallest singular value lower bounded by
$\alpha > 0$. Consequently, according to Fact 3.3-1, this LQ regulator will be capable of
stabilizing all plants in a neighborhood of the nominal one. 

Theorem 4.6-1. With reference to the notations in Fact 3.3-1, let:

• $\chi_{ol}(d)$ and $\chi^{\circ}_{ol}(d)$ have the same number of roots inside the unit circle;

• $\chi_{ol}(d)$ and $\chi^{\circ}_{ol}(d)$ have the same unit circle roots;

• $(\Phi, G, H)$ be stabilizable and detectable and $\psi_u > 0$;

then the steady-state LQ output regulated system designed for the nominal pair
$(\Phi, G)$ remains asymptotically stable for all plants such that at each $d \in \Omega$

\[ \bar{\sigma}(M^{-1}(d) - I_m) < \alpha \tag{4.6-8} \]

with $\alpha > 0$ as in (7).
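The Return Difference Equality (1) and the bound (7) can be checked numerically on a sampled unit circle. The sketch below rests on assumptions that are not part of the text: a hypothetical second-order plant with no unit-circle eigenvalues, the plant transfer matrix written as $P_{xu}(d) = (I - d\Phi)^{-1}dG$, the ARE solved by iterating the RDE, and $\beta^2$ taken as the largest eigenvalue of $\psi_u + G'PG$, which is the smallest value allowed by (5).

```python
import numpy as np

def lq_solution(Phi, G, psi_x, psi_u, iters=2000):
    """Steady-state P and gain K = (psi_u + G'PG)^{-1} G'P Phi via RDE iteration."""
    P = np.zeros_like(psi_x)
    for _ in range(iters):
        S = psi_u + G.T @ P @ G
        P = Phi.T @ P @ Phi - Phi.T @ P @ G @ np.linalg.solve(S, G.T @ P @ Phi) + psi_x
    S = psi_u + G.T @ P @ G
    return P, S, np.linalg.solve(S, G.T @ P @ Phi)

# Hypothetical plant (an assumption); eigenvalues 1.2 and 0.5.
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
psi_x = H.T @ H
psi_u = np.array([[1.0]])

P, S, K = lq_solution(Phi, G, psi_x, psi_u)
n, m = G.shape
# alpha as in (4.6-7), with beta^2 = lambda_max(psi_u + G'PG).
alpha = np.sqrt(np.linalg.eigvalsh(psi_u).min() / np.linalg.eigvalsh(S).max())

worst_gap, sigma_min = 0.0, np.inf
for theta in np.linspace(0.0, 2.0 * np.pi, 181):
    d = np.exp(1j * theta)                              # point on the unit circle
    Pxu = np.linalg.solve(np.eye(n) - d * Phi, d * G)   # (I - d Phi)^{-1} d G
    R = np.eye(m) + K @ Pxu                             # return difference
    lhs = psi_u + Pxu.conj().T @ psi_x @ Pxu
    rhs = R.conj().T @ S @ R
    worst_gap = max(worst_gap, np.abs(lhs - rhs).max())
    sigma_min = min(sigma_min, np.linalg.svd(R, compute_uv=False).min())
```

Here `worst_gap` measures the violation of (1) over the grid and `sigma_min` is the smallest singular value of the return difference found on it, to be compared with `alpha`.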

Main points of the section The return difference equality of LQ regulation
allows us to bound away from zero the minimum singular value of the return
difference of the nominal loop and hence to show robust stability of LQ regulated
systems against plant multiplicative perturbations. Alternatively, stability robust-
ness of LQ regulated systems follows at once from the state-feedback nature of the
LQOR solution, and asymptotic stability of the corresponding nominal closed-loop
system.

Notes and References

Appendix C gives a quick review of the results on linear Diophantine equations
used throughout this chapter. The polynomial equation approach to LQ regulation
was ushered in by the fundamental monograph [Kuc79]. See also [Kuc91].

The deterministic LQOR problem in the discrete-time case seems to have been
first directly tackled and constructively solved in its full generality by polynomial 
tools in [MN89]. Earlier related results also appeared in [Kuc83] and [Gri87]. 

The polynomial solution of the deterministic LQOR problem in the continuous- 
time case appeared in [CGMN91]. The dual problem of stochastic linear
minimum-mean-square error state-filtering, i.e. Kalman filtering, was also solved in its full
generality by polynomial equations in [CM91]. Earlier results on the subject were 
reported in [Kuc81] and [Gri85]. 

Spectral factorization is fundamental in LQ optimization, e.g. in finding polyno- 
mial solutions to optimal filtering problems [CM91]. For a discussion on the scalar 
factorization problem the reader is referred to [AW84]. In [Kuc79] and [JK85]
algorithms are described that are applicable to spectral factorization of matrices. 
Spectral factorization was introduced by [Wie49]. In [You61] a spectral factor- 
ization for rational matrices was developed. See also [Kai68]. [And67] showed 
how spectral factorization for rational matrices can be computed using state-space 
methods, by solving an ARE. 

For continuous-time plants, many robustness results are available [AM90], [Pet89],
[POF89]. These results in general cannot be easily extended to the discrete-time
case. For the latter, see [MZ88], [GdC87] and [NDD92].

CHAPTER 5

DETERMINISTIC RECEDING
HORIZON CONTROL



Receding Horizon Control (RHC) is a conceptually simple method to synthesize 
feedback control laws for linear and nonlinear plants. While the method, if desired, 
can be also used to synthesize approximations to the steady-state LQR feedback 
with a guaranteed stabilizing property, it has extra features which make it par- 
ticularly attractive in some application areas. In fact, since it involves a horizon 
made up by only a finite number of time-steps, the RHC input can be sequen- 
tially calculated on line by existing optimization routines so as to minimize a per- 
formance index and fulfill hard constraints, e.g. bounds on the input and state 
time-evolutions. This is of paramount importance whenever the above mentioned 
constraints are part of the control design specifications. In contrast, in the LQ
control problem over a semi-infinite horizon hard constraints cannot be managed
by standard optimization routines. Consequently, in steady-state LQ control we
are forced to replace, typically at a performance degradation expense, hard with
soft constraints, e.g. instantaneous input hard limits with input mean-square
upper bounds.

Clearly RHC is most suitable for slow linear and nonlinear systems, such as 
chemical batch processes, where it is possible to solve constrained optimization 
control problems on line. Another direction is to use simple RHC rules which 
yield easy feedback computations while guaranteeing a stable closed-loop system 
for generic linear plants or plants satisfying crude open-loop properties. In view of 
adaptive and self tuning control applications, in this chapter we focus our attention 
mainly on the latter use of RHC. 

After a general formulation of the method in Sect. 1, specific receding horizon 
regulation laws are considered and analysed in Sect. 2 to Sect. 7. In Sect. 8 it is 
shown how the results of the previous sections can be suitably extended, by adding
a feedforward action to the basic regulation laws, so as to cover the 2-DOF tracking 
problem. 

5.1 Receding Horizon Regulation 

Considering the results in Chapter 2 and Chapter 4, steady-state LQR can be
looked at as a methodology for designing state-feedback regulators capable of
stabilizing arbitrary linear plants while optimizing a performance index that is
meaningful from an engineering standpoint. However, we have seen in Sect. 2.6
and Sect. 2.7 that two simplified variants of LQR, viz. Cheap Control and Single
Step Regulation, are severely limited in their applicability. Consequently, at this
stage it appears that, for LQR design, solving either an ARE or a spectral
factorization problem is mandatory in that the easier ways of feedback
computation, pertaining to either Cheap Control or Single Step Regulation, are in
general prevented. Now, solving an RDE over a semi-infinite horizon, or an ARE,
typically entails, as seen in Sect. 2.5, iterations. These, in the RDE case, are
slowly converging, while the Kleinman algorithm for solving the ARE must
necessarily be initialized from a stabilizing feedback, possibly, for fast
convergence, not far from the optimal one. Further, the latter algorithm
involves at each iteration step a rather high computational effort for solving the 
related Lyapunov equation (2.5-3). Comparable computational difficulties are as- 
sociated with the spectral factorization problem, particularly in the multiple input 
case. 

Receding Horizon Regulation (RHR) was first proposed to relax the computational
shortcomings of steady-state LQR. In RHR the current input u(t) at time t
and state x is obtained by determining, over an N-step horizon, the input sequence
$\hat u_{[t,t+N)}$ optimal in a constrained LQR sense, and setting $u(t) = \hat u(t)$, the whole
procedure being repeated at time t + 1 to select u(t + 1). Accordingly, at every
time the plant is fed by the initial vector of the optimal input sequence, whose
subsequent $N - 1$ vectors are discarded. The applied input would be optimal should
the subsequent part of the optimal sequence be used as plant input at the
subsequent $N - 1$ steps. Since this is purposely not done, RHR can hardly be considered
"optimal" in any well defined sense. Nevertheless, if the constraints are judiciously 
chosen, RHR can acquire attractive features. Let us consider the following as a 
formal statement of RHR. 

Receding Horizon Regulation (RHR) Consider the time-invariant linear
plant

\[ \left.\begin{aligned}
x(k+1) &= \Phi x(k) + Gu(k), \qquad x(t) = x \in \mathbb{R}^n \\
y(k) &= Hx(k)
\end{aligned}\right\} \tag{5.1-1} \]

with $u(k) \in \mathbb{R}^m$ and $y(k) \in \mathbb{R}^p$. Define the quadratic performance index
over the N-steps prediction horizon [t + 1, t + N]

\[ \sum_{k=t}^{t+N-1} \ell(k, x(k), u(k)) + \|x(t+N)\|^2_{\psi_x(N)} \tag{5.1-2} \]

\[ \ell(k, x, u) = \|x\|^2_{\psi_x(k)} + \|u\|^2_{\psi_u(k)} \tag{5.1-3} \]

with $\psi_x(k) = H'\psi_y(k)H$, $\psi_y(k) = \psi_y'(k) \ge 0$, $\psi_u(k) = \psi_u'(k) \ge 0$, and
$\psi_x(N) = \psi_x'(N) \ge 0$. Find, whenever it exists, an optimal input sequence
$\hat u_{[t,t+N)}$ to the plant (1) minimizing (2) while satisfying the following set of
constraints

\[ \sum_{k=t}^{t+N} \big[X_i(k)x(k) + U_i(k)u(k)\big]
\left\{\begin{matrix} \le \\ = \end{matrix}\right\} c_i \tag{5.1-4} \]

where: $i = 1, 2, \cdots, I$; $X_i(k)$, $U_i(k)$ are matrices and $c_i$ vectors of compatible
dimensions; and the inequality/equality sign indicates that either the first or
the latter applies for the i-th constraint. The RHR input to the plant is then
chosen at time t to be $u(t) = \hat u(t)$. In case $\hat u_{[t,t+N)}$ can be found in an explicit
open-loop form as follows

\[ \hat u(t+i) = f(i, x), \qquad i = 0, 1, \cdots, N-1 \tag{5.1-5} \]

the time-invariant state-feedback given by

\[ u(t) = f(0, x(t)), \qquad \forall t \in \mathbb{Z} \tag{5.1-6} \]

is referred to as the RHR law relative to (1)-(4).



Example 5.1-1 Consider the problem of finding an input sequence

\[ u_0^{n-1} := [\, u(n-1) \ \cdots \ u(0) \,]' \]

which transfers the initial state x(0) = x of (1) to a target state x(n) at time $n = \dim\Phi$. Here
we assume that the plant has a single input, $u(k) \in \mathbb{R}$, and is completely reachable. We have

\[ x(k) = \Phi^k x + R_k u_0^{k-1} \tag{5.1-7} \]

for $k = 0, 1, 2, \cdots$, with

\[ R_k := [\, G \ \ \Phi G \ \cdots \ \Phi^{k-1}G \,] \in \mathbb{R}^{n \times k} \tag{5.1-8} \]

Since the reachability matrix of (1)

\[ R := R_n = [\, G \ \ \Phi G \ \cdots \ \Phi^{n-1}G \,] \tag{5.1-9} \]

is nonsingular, the solution is uniquely given by

\[ u_0^{n-1} = R^{-1}[x(n) - \Phi^n x] \tag{5.1-10} \]

If the terminal state x(n) is constrained to equal $O_n$, we find

\[ u_0^{n-1} = -R^{-1}\Phi^n x \tag{5.1-11} \]

This specifies the open-loop input sequence (5) when: in (1)-(4) N = n; m = 1; (4) reduces to
$x(n) = O_n$; and (2) is arbitrary, being ineffective. The RHR law (6) becomes

\[ u(t) = -e_n'R^{-1}\Phi^n x(t) \tag{5.1-12} \]

where $e_n$ is the n-th vector of the natural basis of $\mathbb{R}^n$

\[ e_n = [\, 0 \ \cdots \ 0 \ \ 1 \,]' \in \mathbb{R}^n \]
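The construction of Example 5.1-1 can be tried out numerically. The following minimal sketch assumes a hypothetical single-input reachable plant; the helper name `deadbeat_rhr_gain` is illustrative only. It builds the reachability matrix (9), forms the gain of (12), and checks that the n-th power of the closed-loop matrix vanishes, as expected when the state is driven to zero in n steps.

```python
import numpy as np

def deadbeat_rhr_gain(Phi, G):
    """Gain F such that u(t) = F x(t) reproduces (5.1-12): F = -e_n' R^{-1} Phi^n."""
    n = Phi.shape[0]
    # Reachability matrix R = [G, Phi G, ..., Phi^{n-1} G] as in (5.1-9).
    R = np.hstack([np.linalg.matrix_power(Phi, i) @ G for i in range(n)])
    e_n = np.zeros(n)
    e_n[-1] = 1.0
    # The last entry of u_0^{n-1} = [u(n-1) ... u(0)]' is u(0), hence e_n.
    return -(e_n @ np.linalg.solve(R, np.linalg.matrix_power(Phi, n))).reshape(1, n)

# Hypothetical plant (an assumption, not taken from the text).
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])

F = deadbeat_rhr_gain(Phi, G)
closed_loop = Phi + G @ F
nilpotency = np.linalg.matrix_power(closed_loop, Phi.shape[0])
```

For this plant the gain works out to `[-1.44, -1.7]`, and `nilpotency` is the zero matrix up to rounding.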

In its general formulation (1)-(6), RHR has to be regarded as a problem of Convex
Quadratic Programming which can be tackled with existing software tools [BB91]. 
In general, this is a quite formidable problem from a computational viewpoint, 
particularly if on line solutions are required. Nevertheless, RHR possesses such 
potential favorable features to even justify a significant computational load. Among 
these features, we mention the capability of RHR of combining, thanks to the 
presence of suitable constraints, short term behaviour with long-range properties, 
e.g. stability requirements. There are, however, RHR laws both computable with
moderate efforts and yielding attractive properties to the regulated system which 
can be expressed in an explicit form. They will be the main subject of the remaining 
part of this chapter. 

We warn the reader against believing that in general the function f(i, x) in (5) can
be obtained in a closed analytic form. In fact, this happens only in specific cases,
e.g. Example 1. Whenever f(i, x) cannot be found, one is forced to solve (1)-(5)
numerically, to apply the input (6) at time t, and to repeat the whole optimization
procedure (1)-(5) over the next prediction horizon [t + 2, t + N + 1] so as to find
numerically the input at time t + 1.

Before progressing, it is convenient to point out that Cheap Control (Sect. 2.6) 
and Single Step Regulation (Sect. 2.7) can be embedded in the RHR formulation. 
For both of them the set of constraints (4) is void, and the prediction horizon
is the shortest possible one, a single step. As was
remarked, the resulting regulation laws are unacceptable in many practical cases
mainly because of their unsatisfactory stabilizing properties. In this respect, a 
definite improvement can be obtained by enlarging the prediction horizon and/or 
adding suitable constraints. In order to gain some understanding on how to ra- 
tionally make this selection, we introduce next a tool for analysing the stabilizing 
properties of some RHR laws. 

Main points of the section Though RHR was first introduced to lighten the 
computational load of steady-state LQR, in its general formulation it is a Convex 
Quadratic Programming problem with an associated high computational burden. 
Nevertheless, a judicious choice of the RHR design knobs can lead to regulation
laws both computable with moderate efforts and yielding attractive properties to 
the regulated system. 



5.2 RDE Monotonicity and Stabilizing RHR 

A convenient tool for studying the stabilizing properties of some RHR schemes is 
the Fake Algebraic Riccati Equation (FARE) introduced in Problem 2.4-12. The 
FARE argument we shall mostly use is here restated. 



FARE Argument Consider a stabilizable and detectable plant
$(\Phi, G, H)$ and the related LQOR problem of Sect. 2.4. Let P(k) be the
solution of the relevant forward RDE

\[ P(k+1) = \Phi'P(k)\Phi - \Phi'P(k)G[\psi_u + G'P(k)G]^{-1}G'P(k)\Phi + \psi_x \tag{5.2-1} \]

initialized from some $P(0) = P'(0) \ge 0$. Then,

\[ P(k+1) \le P(k) \tag{5.2-2} \]

implies that the state-feedback gain matrix

\[ F(k+1) = -[\psi_u + G'P(k)G]^{-1}G'P(k)\Phi \tag{5.2-3} \]

stabilizes the plant, viz. $\Phi + GF(k+1)$ is a stability matrix.



If to a given RHR problem we can associate a suitable RDE (1) whose solution 
satisfies (2), asymptotic stability of the regulated plant follows, whenever the state- 
feedback has the form (3). 

Other useful features of the RDE are its monotonicity properties, viz.

\[ P(k+1) \le P(k) \ \Rightarrow \ P(k+2) \le P(k+1) \tag{5.2-4} \]

\[ P(k+1) \ge P(k) \ \Rightarrow \ P(k+2) \ge P(k+1) \tag{5.2-5} \]

Eq. (4) tells us that, if F(k + 1) can be proved, via the FARE argument, to be
a stabilizing state-feedback, F(k + 2) is stabilizing as well. Conversely, (5) shows 
that if F(k + 1) cannot be proved to be stabilizing via the FARE argument, the 
same is true for F(k + 2). 

The above monotonicity properties (4) and (5) can be proved by using the next 
result, given in [dS89]. 

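Fact 5.2-1. Let $P^1(k)$ and $P^2(k)$ denote the solutions of two forward RDEs with
the same $(\Phi, G)$ and $\psi_u$, but possibly different initializations $P^1(0)$ and, respectively,
$P^2(0)$, and different weights $\psi_x^1$ and $\psi_x^2$ respectively. Let:

\[ \tilde P(k) := P^2(k) - P^1(k); \]

\[ \tilde\psi_x := \psi_x^2 - \psi_x^1; \qquad \tilde\psi_u(k) := \psi_u + G'P^1(k)G; \]

\[ \tilde\Phi(k) := \Phi - G\tilde\psi_u^{-1}(k)G'P^1(k)\Phi \]

Then, $\tilde P(k)$ satisfies the forward RDE

\begin{align}
\tilde P(k+1) = {} & \tilde\Phi'(k)\tilde P(k)\tilde\Phi(k) \; - \nonumber \\
& \tilde\Phi'(k)\tilde P(k)G[\tilde\psi_u(k) + G'\tilde P(k)G]^{-1}G'\tilde P(k)\tilde\Phi(k) + \tilde\psi_x \tag{5.2-6}
\end{align}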

Proposition 5.2-1. Let P(k) be the nonnegative definite solution of the RDE (1).
Then:

i. If P(k) is monotonically nonincreasing at the step k, i.e. $P(k+1) \le P(k)$,
it is monotonically nonincreasing for all subsequent steps, i.e.
$P(k+i+1) \le P(k+i)$ for all $i \ge 0$;

ii. If P(k) is monotonically nondecreasing at the step k, i.e. $P(k+1) \ge P(k)$,
it is monotonically nondecreasing for all subsequent steps, i.e.
$P(k+i+1) \ge P(k+i)$ for all $i \ge 0$.

Proof It is directly based on (6).

i. Let $P^2(k) := P(k)$ and $P^1(k) := P(k+1)$. Then, $\tilde P(k) := P^2(k) - P^1(k) = \tilde P'(k) \ge 0$.
Further, $\tilde P(k+1)$ satisfies (6) with $\tilde\psi_x = 0$. Then, it follows from Theorem 2.3-1 that
$\tilde P(k+1) \ge 0$. Therefore, assertion i. can be proved by induction.

ii. Let $P^2(k) := P(k+1)$ and $P^1(k) := P(k)$. Then, $\tilde P(k) := P^2(k) - P^1(k) = \tilde P'(k) \ge 0$ and
we can use the same argument as in i. to prove assertion ii.
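The FARE argument and the monotonicity property (4) can be observed on a small numerical example. The sketch below makes several assumptions: the plant is hypothetical, and P(0) is chosen as the infinite-horizon cost matrix of a known stabilizing (deadbeat) gain F0, which guarantees $P(1) \le P(0)$ because one RDE step minimizes over the feedback gain. It then iterates the RDE and records, at each step, the smallest eigenvalue of $P(k) - P(k+1)$ and the spectral radius of $\Phi + GF(k+1)$.

```python
import numpy as np

def rde_step(P, Phi, G, psi_x, psi_u):
    """One step of the forward RDE (5.2-1); also returns the gain F(k+1) of (5.2-3)."""
    S = psi_u + G.T @ P @ G
    F = -np.linalg.solve(S, G.T @ P @ Phi)
    return Phi.T @ P @ Phi + Phi.T @ P @ G @ F + psi_x, F

# Hypothetical plant and weights (assumptions).
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
psi_x = np.array([[1.0, 0.0], [0.0, 0.0]])
psi_u = np.array([[1.0]])

# P(0): cost of the deadbeat gain F0. Since (Phi + G F0)^2 = 0 the cost series
# stops after two terms.
F0 = np.array([[-1.44, -1.7]])
A0 = Phi + G @ F0
Q0 = psi_x + F0.T @ psi_u @ F0
P = Q0 + A0.T @ Q0 @ A0

history = []   # (min eigenvalue of P(k) - P(k+1), spectral radius of Phi + G F(k+1))
for _ in range(30):
    P_next, F = rde_step(P, Phi, G, psi_x, psi_u)
    drop = np.linalg.eigvalsh(P - P_next).min()
    rho = np.abs(np.linalg.eigvals(Phi + G @ F)).max()
    history.append((drop, rho))
    P = P_next
```

Every recorded `drop` should be nonnegative up to rounding and every `rho` below one, in agreement with (2)-(4).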

In order to exploit Proposition 1 in the FARE argument, the easiest idea that comes
to one's mind is to leave the constraint set (1-4) void, and to choose $P(0) = \psi_x(N)$
very large, e.g. $P(0) = rI_n$, with r positive and very large. In this way one might
think that a single iteration of the RDE (1) would yield $P(1) \le P(0)$. The next problem
shows that this conjecture is generally false.

Problem 5.2-1 Show that in the single input case if $P(0) = rI_n$, $r\|G\|^2 \gg \psi_u$, and $\Phi$ is
nonsingular, the RDE (1) gives

\[ P(1) - P(0) \approx r\,\Phi'\Big[I_n - \frac{GG'}{\|G\|^2} - \Phi^{-T}\Phi^{-1}\Big]\Phi + \psi_x \tag{5.2-7} \]

where $\Phi^{-T} := (\Phi')^{-1}$.

Next, verify that, while for $\Phi$ scalar $P(0) - P(1) + \psi_x = r > 0$, for the following second order
system



1 
a 10 



G - 



(5.2-8)

$P(0) - P(1)$ given by (7) is not nonnegative definite, irrespective of r, a and $\psi_x$.
Finally, show that in the case (8)

\[ \Phi + GF(1) = \Phi - G[\psi_u + G'P(0)G]^{-1}G'P(0)\Phi \]

is not a stability matrix.
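The failure of the "very large P(0)" idea can be seen on a small numerical case. The plant below is a hypothetical second-order system chosen for illustration; it is an assumption and is not the system (8) of the problem. The sketch performs one RDE step from $P(0) = rI_n$, tests whether $P(0) - P(1)$ is nonnegative definite, computes the spectral radius of $\Phi + GF(1)$, and compares the exact difference with the large-r approximation $r[\Phi'(I_n - GG'/\|G\|^2)\Phi - I_n] + \psi_x$, which follows from (1) when $r\|G\|^2 \gg \psi_u$.

```python
import numpy as np

# Hypothetical reachable plant (an assumption, not the book's (5.2-8)).
Phi = np.array([[1.0, 0.0], [2.0, 1.5]])
G = np.array([[1.0], [0.0]])
psi_x = np.eye(2)
psi_u = np.array([[1.0]])
r = 1.0e6

P0 = r * np.eye(2)
S = psi_u + G.T @ P0 @ G
F1 = -np.linalg.solve(S, G.T @ P0 @ Phi)              # gain F(1) of (5.2-3)
P1 = Phi.T @ P0 @ Phi + Phi.T @ P0 @ G @ F1 + psi_x   # one step of (5.2-1)

g2 = (G.T @ G).item()                                 # ||G||^2
approx = r * (Phi.T @ (np.eye(2) - G @ G.T / g2) @ Phi - np.eye(2)) + psi_x

min_eig = np.linalg.eigvalsh(P0 - P1).min()           # < 0: P(1) <= P(0) fails
rho = np.abs(np.linalg.eigvals(Phi + G @ F1)).max()   # spectral radius of Phi + G F(1)
rel_err = np.abs((P1 - P0) - approx).max() / r        # quality of the approximation
```

For this choice the smallest eigenvalue of $P(0) - P(1)$ is negative and the closed-loop spectral radius is about 1.5, so neither the monotonicity test nor stability holds.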

The above discussion indicates that in order to possibly exploit the FARE argument 
in RHR we have to introduce some active constraint in (1)-(4). In the next section
it will be shown that this is the case if the terminal state x(N) is constrained to
$O_n$ and N is made larger than $\dim\Phi$.

Main points of the section The FARE argument provides a sufficient condi- 
tion for establishing when in an LQOR problem the state-feedback gain matrix 
computed via an RDE stabilizes the closed loop system. 



5.3 Zero Terminal State RHR 

We show that RHR yields an asymptotically stable closed-loop system, provided 
that the prediction horizon is long enough and a zero terminal state constraint is 
used. In this connection, we first discuss in the next example how state-deadbeat 
can be obtained via RHR for single-input completely reachable plants. 

Example 5.3-1 Consider again the problem of Example 1-1 and the related zero terminal state
RHR law (1-12)

\[ u(t) = -e_n'R^{-1}\Phi^n x(t) \tag{5.3-1} \]

We show that (1) gives rise to a state-deadbeat closed-loop system, i.e. a system with
closed-loop characteristic polynomial $\chi_{cl}(z) = z^n$. In order to establish this property, we make use of
Ackermann's formula [Kai80]. This states that, if (1-1) is a single-input completely reachable
plant, the state-feedback regulation law u(t) = Fx(t) needed to get a closed-loop characteristic
polynomial $\chi_{cl}(z)$ equals

\[ u(t) = -e_n'R^{-1}\chi_{cl}(\Phi)x(t) \tag{5.3-2} \]

Then, that (1) yields state-deadbeat immediately follows from (2).

We now turn to consider the more general case of MIMO plants when in (1-1)-(1-6):
$\psi_y(k) = \psi_y = \psi_y' \ge 0$; $\psi_u(k) = \psi_u = \psi_u' \ge 0$; the prediction horizon length N
is arbitrary; and the constraints reduce to the zero terminal state. Specifically, we
shall consider the following version of (1-1)-(1-5).

Zero Terminal State Regulation Consider the problem of finding,
whenever it exists, an input sequence $\hat u_{[t,t+N)}$

\[ \hat u(t+N-k) = F(k)x(t), \qquad k = 1, \cdots, N \tag{5.3-3} \]

to the plant (1-1), minimizing the performance index

\[ \sum_{i=0}^{N-1}\big[\|y(t+i)\|^2_{\psi_y} + \|u(t+i)\|^2_{\psi_u}\big] \tag{5.3-4} \]

under the zero terminal state constraint

\[ x(t+N) = O_n \tag{5.3-5} \]

We shall prove the following classic result on the stabilizing properties of the RHR
law based on zero terminal state regulation. 

Theorem 5.3-1. Consider the zero terminal state regulation (3)-(5) with $\psi_u > 0$.
Let the plant (1-1) be completely reachable. Then the feedback RHR law

\[ u(t) = F(N)x(t) \tag{5.3-6} \]

exists and stabilizes (1-1) under the following conditions.

Case a) $\psi_y = O_{p \times p}$:

\[ N \ge n \tag{5.3-7} \]

Case b) $\psi_y > 0$: The plant (1-1) is detectable, $\Phi$ is nonsingular and

\[ N \ge n + 1 \tag{5.3-8} \]

Further, for single-input completely reachable plants, irrespective of $\Phi$, $\psi_y$ and $\psi_u$,
(6) yields state-deadbeat regulation whenever

\[ N = n \tag{5.3-9} \]

The results (7) and (8) can be made sharper by replacing the plant dimension
n with the reachability index $\nu$, $\nu \le n$, of the pair $(\Phi, G)$.
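Theorem 5.3-1 can be explored numerically by solving the zero terminal state problem (3)-(5) directly as an equality-constrained quadratic program. This is a sketch under assumptions: the plant is hypothetical, the helper name `zero_terminal_gain` is illustrative, and the problem is solved through its KKT linear system rather than through the Riccati-type recursions of the text. The returned matrix maps x(t) to the first input of the optimal sequence, i.e. it plays the role of F(N) in (6).

```python
import numpy as np

def zero_terminal_gain(Phi, G, psi_x, psi_u, N):
    """F(N): first input of the sequence minimizing (5.3-4) subject to x(t+N) = 0."""
    n, m = G.shape
    A = [np.linalg.matrix_power(Phi, i) for i in range(N + 1)]
    # x(i) = A[i] x0 + B[i] U, with U = [u(0); ...; u(N-1)].
    B = [np.zeros((n, N * m)) for _ in range(N + 1)]
    for i in range(1, N + 1):
        for j in range(i):
            B[i][:, j * m:(j + 1) * m] = A[i - 1 - j] @ G
    Hq = np.kron(np.eye(N), psi_u)          # quadratic term U' Hq U
    Lq = np.zeros((N * m, n))               # cross term 2 U' Lq x0
    for i in range(N):
        Hq += B[i].T @ psi_x @ B[i]
        Lq += B[i].T @ psi_x @ A[i]
    # KKT system: stationarity plus the terminal constraint B[N] U = -A[N] x0.
    kkt = np.block([[Hq, B[N].T], [B[N], np.zeros((n, n))]])
    rhs = -np.vstack([Lq, A[N]])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:m, :]

# Hypothetical plant with psi_y = 1 (Case b); all values are assumptions.
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
psi_x = H.T @ H
psi_u = np.array([[1.0]])
n = Phi.shape[0]

F_deadbeat = zero_terminal_gain(Phi, G, psi_x, psi_u, n)          # N = n
radii = [np.abs(np.linalg.eigvals(
             Phi + G @ zero_terminal_gain(Phi, G, psi_x, psi_u, N))).max()
         for N in range(n + 1, n + 7)]                            # N >= n + 1
```

With N = n the gain coincides with the deadbeat gain of Example 5.1-1, while for the longer horizons in `radii` the closed-loop spectral radius is expected to stay below one.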

The state deadbeat regulation property has been constructively proved in Ex- 
ample 1. That in Case a) under (8), (6) is a stabilizing regulation law, is shown in 
[Kle74] for an invertible $\Phi$ and for any square $\Phi$ in [KP75] for the single-input case
and in [SE88] for the multi-input case. Our interest in Case a) is quite limited, 
the main reason being that Case b) gives us some extra freedom which can be con- 
veniently exploited for regulation design. Hence, the interested reader is directly 
referred to [Kle74], [KP75] and [SE88] for details on the proof of Case a). On the 
contrary, the proof of Case b) is next reported in depth by following the approach 
of [BGW90], based on the FARE argument. This proof is based on an alternative 
form for the RDE which requires nonsingularity of $\Phi$. The extension of Case b)
to possibly singular state-transition matrices will be dealt with before closing the 
section. 

Intuitively, (5) can be implicitly embodied in the RHR formulation
(1-1)-(1-6) by setting $\psi_x(N) = rI_n$ with $r \to \infty$ and (1-4) void. This amounts
to using the Dynamic Programming formulae (2.4-9)-(2.4-15) with such a $\psi_x(N)$,
or more precisely,

\[ P^{-1}(0) = O_{n \times n} \tag{5.3-10} \]

In order to directly exploit (10), we reexpress (2.4-9)-(2.4-15) in terms of an
iterative equation for $P^{-1}(k)$. To this end, it is convenient to set

\[ \Pi(k+1) := P(k) - P(k)G[\psi_u + G'P(k)G]^{-1}G'P(k) \tag{5.3-11} \]

and

\[ \Omega(k) := \Pi^{-1}(k) \tag{5.3-12} \]
The relevant results are summed up in the next lemma.

Lemma 5.3-1. Assume that the state-transition matrix $\Phi$ of the plant (1-1) is
nonsingular. Then, whenever $P^{-1}(k)$ exists, we have

\[ \Omega(k+1) = P^{-1}(k) + G\psi_u^{-1}G' \tag{5.3-13} \]

Further, $\Omega(k)$, $k = 0, 1, 2, \cdots$, satisfies the following forward RDE

\begin{align}
\Omega(k+1) = {} & \Phi^{-1}\Omega(k)\Phi^{-T} - \Phi^{-1}\Omega(k)\Phi^{-T}\psi_x^{T/2} \times \nonumber \\
& \big[I_p + \psi_x^{1/2}\Phi^{-1}\Omega(k)\Phi^{-T}\psi_x^{T/2}\big]^{-1}\psi_x^{1/2}\Phi^{-1}\Omega(k)\Phi^{-T} + \nonumber \\
& G\psi_u^{-1}G' \tag{5.3-14}
\end{align}

Finally, provided that $\Omega(k+1)$ is nonsingular, the LQOR feedback-gain matrix
F(k + 1), as in (2.4-10), can be expressed as follows

\begin{align}
F(k+1) &= -[\psi_u + G'P(k)G]^{-1}G'P(k)\Phi \nonumber \\
&= -\psi_u^{-1}G'\Omega^{-1}(k+1)\Phi \tag{5.3-15}
\end{align}

In (14): $\Phi^{-T} := (\Phi')^{-1}$; $\psi_x^{1/2} := \psi_y^{1/2}H$; $\psi_x^{T/2} := (\psi_x^{1/2})'$; and $\psi_y^{1/2}$ is the
square-root of $\psi_y$ [GVL83], $\psi_y = (\psi_y^{1/2})^2$.

To prove Lemma 1 we shall make repeated use of the following result. 

Matrix Inversion Lemma Let A, C and $DA^{-1}B + C^{-1}$ be nonsingular matrices. Then

\[ (A + BCD)^{-1} = A^{-1} - A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1} \tag{5.3-16} \]

Proof Let

\[ \left.\begin{aligned}
C^{-1}x &= Du \\
y &= Bx + Au
\end{aligned}\right\} \tag{5.3-17} \]

with u, x and y vectors of suitable dimensions. Then, we have y = (A + BCD)u. Consider the
"inverse" system of (17)

\begin{align*}
C^{-1}x &= DA^{-1}(y - Bx) \\
u &= -A^{-1}Bx + A^{-1}y
\end{align*}

from which we get $u = A^{-1}y - A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1}y$.



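Proof of Lemma 3-1 Using (16), we establish (13)

\begin{align*}
\Omega(k+1) &= \Pi^{-1}(k+1) \\
&= P^{-1}(k) - P^{-1}(k)[-P(k)G] \times \\
&\qquad \big\{G'P(k)P^{-1}(k)[-P(k)G] + G'P(k)G + \psi_u\big\}^{-1}G'P(k)P^{-1}(k) \\
&= P^{-1}(k) + G\psi_u^{-1}G'
\end{align*}

Next, from (11) it follows that

\[ P(k) = \Phi'\Pi(k)\Phi + \psi_x \]

Hence

\begin{align}
W(k) := P^{-1}(k) &= \big[\Phi'\Pi(k)\Phi + \psi_x\big]^{-1} \nonumber \\
&= \Phi^{-1}\big[\Pi(k) + \Phi^{-T}\psi_x\Phi^{-1}\big]^{-1}\Phi^{-T} \tag{5.3-18} \\
&= \Phi^{-1}\Big\{\Pi^{-1}(k) - \Pi^{-1}(k)\Phi^{-T}\psi_x^{T/2} \times \nonumber \\
&\qquad \big[I_p + \psi_x^{1/2}\Phi^{-1}\Pi^{-1}(k)\Phi^{-T}\psi_x^{T/2}\big]^{-1}\psi_x^{1/2}\Phi^{-1}\Pi^{-1}(k)\Big\}\Phi^{-T} \nonumber
\end{align}

Then, by (13), (14) follows.
Finally,

\begin{align*}
F(k+1) &= -\big[\psi_u + G'P(k)G\big]^{-1}G'P(k)\Phi \\
&= -\big[\psi_u + G'W^{-1}(k)G\big]^{-1}G'W^{-1}(k)\Phi && [(18)] \\
&= -\psi_u^{-1}\Big\{G' - G'\big[G\psi_u^{-1}G' + W(k)\big]^{-1}G\psi_u^{-1}G'\Big\}W^{-1}(k)\Phi && [(16)] \\
&= -\psi_u^{-1}\Big\{G' - G'\Omega^{-1}(k+1)\big[\Omega(k+1) - W(k)\big]\Big\}W^{-1}(k)\Phi && [(13)]
\end{align*}

which yields (15).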
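The identities of Lemma 5.3-1 can be verified numerically for a single step. The sketch below assumes a hypothetical plant with nonsingular $\Phi$ and an arbitrary positive definite $\Omega(k)$; the function name `lemma_residuals` is illustrative. It builds P(k) from $\Omega(k)$ via $P(k) = \Phi'\Omega^{-1}(k)\Phi + \psi_x$, computes $\Omega(k+1)$ from the definitions (11)-(12), and compares it with (13) and (14); it also compares the two gain expressions in (15).

```python
import numpy as np

def lemma_residuals(Phi, G, H, psi_y, psi_u, Omega_k):
    """Max-abs residuals of (5.3-13), (5.3-14) and (5.3-15) for one step."""
    p = H.shape[0]
    w, V = np.linalg.eigh(psi_y)
    root_y = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T   # symmetric psi_y^{1/2}
    Bx = (root_y @ H).T                     # psi_x^{T/2}
    psi_x = Bx @ Bx.T                       # equals H' psi_y H
    P = Phi.T @ np.linalg.inv(Omega_k) @ Phi + psi_x
    S = psi_u + G.T @ P @ G
    Pi_next = P - P @ G @ np.linalg.solve(S, G.T @ P)            # (5.3-11)
    Omega_def = np.linalg.inv(Pi_next)                           # (5.3-12)
    GRG = G @ np.linalg.solve(psi_u, G.T)
    Omega_13 = np.linalg.inv(P) + GRG                            # (5.3-13)
    M = np.linalg.solve(Phi, Omega_k) @ np.linalg.inv(Phi).T     # Phi^{-1} Omega Phi^{-T}
    Omega_14 = M - M @ Bx @ np.linalg.solve(np.eye(p) + Bx.T @ M @ Bx, Bx.T @ M) + GRG
    F_def = -np.linalg.solve(S, G.T @ P @ Phi)
    F_15 = -np.linalg.solve(psi_u, G.T @ np.linalg.solve(Omega_def, Phi))
    return (np.abs(Omega_def - Omega_13).max(),
            np.abs(Omega_def - Omega_14).max(),
            np.abs(F_def - F_15).max())

# Hypothetical data (assumptions).
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
psi_y = np.array([[2.0]])
psi_u = np.array([[1.0]])
Omega_k = np.array([[2.0, 0.5], [0.5, 1.0]])

res_13, res_14, res_15 = lemma_residuals(Phi, G, H, psi_y, psi_u, Omega_k)
```

All three residuals should be at rounding level.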

We note that (14) is the forward RDE associated with the following LQOR problem:

\[ \left.\begin{aligned}
\xi(t+1) &= \Phi^{-T}\xi(t) + \Phi^{-T}\psi_x^{T/2}\nu(t) \\
\gamma(t) &= G'\xi(t)
\end{aligned}\right\} \tag{5.3-19} \]

\[ \ell(\xi, \nu) = \|\gamma(t)\|^2_{\psi_u^{-1}} + \|\nu(t)\|^2 \tag{5.3-20} \]

In order to impose the zero terminal condition (5), we see from (13) that the
iterations (14) must be initialized from $\Omega(0) = O_{n \times n}$. In such a case, the following
lemma ensures nonsingularity of $\Omega(k)$, provided that $k \ge \dim\Phi$.

Lemma 5.3-2. Consider the solution $\Omega(k)$ of the forward RDE (14) initialized
from $\Omega(0) = O_{n \times n}$. Then, provided that $(\Phi, G)$ is a reachable pair,

\[ \det\Omega(n+i) \ne 0, \qquad \forall i \ge 0 \tag{5.3-21} \]

with $n = \dim\Phi$.



Proof (by contradiction) First, by nonsingularity of $\Phi$, complete reachability of $(\Phi, G)$ is equivalent
to complete observability of $(\Phi^{-T}, G')$, i.e. complete observability of the dynamic system (19).
Next, consider the LQOR problem (19)-(20). Assume that there is a vector $\xi$, $\xi \ne O_n$, such
that $\xi'\Omega(k)\xi = 0$. This means (Cf. (2.4-15)) that if the plant initial state is $\xi$, the optimal input
sequence $\nu^{\circ}_{[0,k)}$ is such that

\[ \sum_{i=0}^{k-1}\big[\|\gamma^{\circ}(i)\|^2_{\psi_u^{-1}} + \|\nu^{\circ}(i)\|^2\big] = 0, \]

$\gamma^{\circ}(i)$ being the output of (19) at the time i for $\xi(0) = \xi$ and inputs $\nu^{\circ}_{[0,i)}$. Then, $\nu^{\circ}_{[0,k)}$
and $\gamma^{\circ}_{[0,k)}$ are both zero. For $k \ge n$, this contradicts complete observability of (19).

Proof of Theorem 3-1. Complete reachability ensures that problem (10) is solvable for any $N \ge n$.
Concomitant minimization of (9) yields uniqueness of $\hat u_{[0,N)}$. Further, according to (21), F(N),
$N \ge n$, is computable via (14) and (15). By (11), (12) and (21), we find

\[ P(k) = \Phi'\Omega^{-1}(k)\Phi + \psi_x, \qquad \forall k \ge n \tag{5.3-22} \]

Further,

\[ \Omega(1) \ge \Omega(0) = O_{n \times n} \]

implies, by (14) and Proposition 2-1, that

\[ \Omega(k+1) \ge \Omega(k), \qquad \forall k \ge 0 \]

Then, from (22) it follows that

\[ P(k+1) \le P(k), \qquad \forall k \ge n \]

By the FARE argument, we conclude that the state-feedback gain-matrix

\[ F(k+1) = -[\psi_u + G'P(k)G]^{-1}G'P(k)\Phi, \qquad \forall k \ge n \]

stabilizes the plant (1).

The major limitation of the stability results of Theorem 1 lies in the nonsingularity
assumption on the plant state-transition matrix $\Phi$. In this respect, one could argue
that this is not a real limitation, since a discrete-time plant resulting from using a
zero-order-hold input and sampling uniformly in time any continuous-time finite-
dimensional linear time-invariant system, always has a nonsingular state-transition
matrix for suitable choices of the state. In fact, if the continuous-time system is
given by $dz(t)/dt = Az(t) + Bv(t)$, $t \in \mathbb{R}$, and $T_s$ is the sampling interval, the
corresponding discrete-time plant $x(t+1) = \Phi x(t) + Gu(t)$, $t \in \mathbb{Z}$, has $\Phi = \exp(AT_s)$
for $x(t) = z(tT_s)$. The conclusion, hence, would be that, since our interests are
mainly in sampled-data plants, assuming $\Phi$ nonsingular entails little limitation. However, the
situation is not so simple, since we are also interested in controlling sampled-data
continuous-time infinite-dimensional linear time-invariant systems, viz. systems 
with deadtime or I/O transport delay. In such a case the previous argument does 
not hold true any longer. 

Problem 5.3-1 Consider a SISO plant described by the following difference equation

\[ y(t) + a_1y(t-1) + \cdots + a_{n_a}y(t-n_a) = b_1u(t-1) + \cdots + b_{n_b}u(t-n_b) \tag{5.3-23} \]

with $a_{n_a} \cdot b_{n_b} \ne 0$. Show that its state-space canonical reachability representation (2.6-8) has
nonsingular state-transition matrix if and only if $n_b \le n_a$.

Problem 5.3-2 Consider the plant of Problem 1 but now with an extra unit I/O delay, i.e. its
output $\gamma(t)$ is obtained by delaying y(t) by one step, $\gamma(t) = y(t-1)$. Show that this new plant,
if $n_b = n_a$, has state-space canonical reachability representation with singular state-transition
matrix.



Another situation in which we have to tackle a singular $\Phi$ arises when we want to
use the RHR law (7) with a state x(t) which does not coincide with $z(tT_s)$. This
happens for instance in the practically important case where the x(t)-components
are externally accessible variables such as past I/O pairs.
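That a state made of past I/O samples leads to a singular state-transition matrix can be checked on a small case. The sketch below is an illustration under assumptions: the coefficients are hypothetical, the helper name `io_state_model` is not from the text, the state is ordered as $[y(t-n_a+1), \cdots, y(t), u(t-n_b+1), \cdots, u(t-1)]'$, and $n_b \ge 2$ is required. It builds the model for the difference equation (23), evaluates the determinant of $\Phi$, and compares the simulated output with the difference equation itself.

```python
import numpy as np

def io_state_model(a, b):
    """State-space model of (5.3-23) with past I/O samples as state (needs len(b) >= 2)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    n = na + nb - 1
    Phi = np.zeros((n, n))
    for i in range(na - 1):
        Phi[i, i + 1] = 1.0              # shift the stored outputs
    Phi[na - 1, :na] = -a[::-1]          # -a_na ... -a_1
    Phi[na - 1, na:] = b[:0:-1]          # b_nb ... b_2
    for i in range(na, n - 1):
        Phi[i, i + 1] = 1.0              # shift the stored inputs
    G = np.zeros((n, 1))
    G[na - 1, 0] = b[0]                  # b_1 u(t) enters y(t+1)
    G[n - 1, 0] = 1.0                    # u(t) becomes the newest stored input
    H = np.zeros((1, n))
    H[0, na - 1] = 1.0                   # y(t) is the na-th state component
    return Phi, G, H

# Hypothetical coefficients (assumptions).
a = [-1.5, 0.7]
b = [1.0, 0.5, 0.2]
Phi, G, H = io_state_model(a, b)
det_Phi = np.linalg.det(Phi)

# Compare with the difference equation, starting from rest.
T = 40
u = np.sin(0.3 * np.arange(T))
s = np.zeros((Phi.shape[0], 1))
y_state = np.zeros(T)
y_direct = np.zeros(T)
for t in range(T):
    y_state[t] = (H @ s).item()
    s = Phi @ s + G * u[t]
    acc = 0.0
    for i, ai in enumerate(a, start=1):
        if t - i >= 0:
            acc -= ai * y_direct[t - i]
    for j, bj in enumerate(b, start=1):
        if t - j >= 0:
            acc += bj * u[t - j]
    y_direct[t] = acc
mismatch = np.abs(y_state - y_direct).max()
```

The last row of $\Phi$ is zero, so the determinant vanishes, while the two output sequences agree to rounding error.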

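Example 5.3-2 Consider the SISO plant described by the difference equation (23). Let
(Cf. Problem 2.4-5)

$$s(t) := \left[ \left( y^{t-n_a+1}_{t} \right)' \;\; \left( u^{t-n_b+1}_{t-1} \right)' \right]' \in \mathbb{R}^{n_a+n_b-1} \tag{5.3-24}$$

where $u^{t-n}_{t} := [\, u(t-n) \;\cdots\; u(t) \,]'$. Then (23) can be represented in state-space form as
follows

$$s(t+1) = \Phi s(t) + G u(t), \qquad y(t) = H s(t) \tag{5.3-25}$$

with

$$\Phi = \begin{bmatrix} [\, O_{(n_a-1)\times 1} \;\;\; I_{n_a-1} \;\;\; O_{(n_a-1)\times(n_b-1)} \,] \\ [\, -a_{n_a} \;\cdots\; -a_1 \;\;\; b_{n_b} \;\cdots\; b_2 \,] \\ [\, O_{(n_b-2)\times(n_a+1)} \;\;\; I_{n_b-2} \,] \\ O_{1\times(n_a+n_b-1)} \end{bmatrix} \tag{5.3-26}$$

$$G = b_1 e_{n_a} + e_{n_a+n_b-1}, \qquad H = e'_{n_a} \tag{5.3-27}$$

where $e_i := [\, 0 \;\cdots\; 1 \;\cdots\; 0 \,]'$, the 1 being in the $i$-th position. Note that $\Phi$ is singular, though, under appropriate assumptions
on (23), $(\Phi, G, H)$ is completely reachable and reconstructible.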
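
The construction of Example 5.3-2 can be checked numerically. The following is a minimal sketch in plain Python, under assumptions of our own: the helper names `io_state_space` and `step` and the plant coefficients are illustrative and do not come from the text, and the state ordering is the one in (5.3-24), oldest sample first. The sketch builds $(\Phi, G, H)$, compares the state-space output with a direct run of the difference equation (23), and confirms that the last row of $\Phi$ is zero, which is what makes $\Phi$ singular.

```python
def io_state_space(a, b):
    """Build (Phi, G, H) for s(t) = [y(t-na+1..t), u(t-nb+1..t-1)]'.

    a = [a_1, ..., a_na] and b = [b_1, ..., b_nb] as in (5.3-23).
    Assumes nb >= 2 so that the state stores at least one past input.
    """
    na, nb = len(a), len(b)
    n = na + nb - 1
    Phi = [[0.0] * n for _ in range(n)]
    for i in range(na - 1):                 # shift the stored outputs
        Phi[i][i + 1] = 1.0
    for j in range(na):                     # row of y(t+1): -a_na ... -a_1
        Phi[na - 1][j] = -a[na - 1 - j]
    for j in range(nb - 1):                 # row of y(t+1): b_nb ... b_2
        Phi[na - 1][na + j] = b[nb - 1 - j]
    for i in range(na, n - 1):              # shift the stored inputs
        Phi[i][i + 1] = 1.0
    # The last row of Phi stays zero: u(t) enters only through G.
    G = [0.0] * n
    G[na - 1] = b[0]
    G[n - 1] = 1.0
    H = [0.0] * n
    H[na - 1] = 1.0
    return Phi, G, H


def step(Phi, G, s, u):
    """One update s(t+1) = Phi s(t) + G u(t)."""
    n = len(s)
    return [sum(Phi[i][j] * s[j] for j in range(n)) + G[i] * u for i in range(n)]


# Illustrative coefficients (assumed, not from the text): na = 2, nb = 3.
a, b = [-1.5, 0.7], [1.0, 0.5, 0.2]
Phi, G, H = io_state_space(a, b)
inputs = [1.0, -0.5, 0.25, 0.0, 2.0, 0.0, 0.0, 1.0]

# Output sequence y(1), y(2), ... from the state-space model, zero initial state.
s, y_ss = [0.0] * len(G), []
for u in inputs:
    s = step(Phi, G, s, u)
    y_ss.append(sum(h * x for h, x in zip(H, s)))

# Same sequence from the difference equation (5.3-23) run directly.
y_hist, u_hist, y_direct = [0.0] * len(a), [0.0] * len(b), []
for u in inputs:
    u_hist = [u] + u_hist[:-1]              # u(t), u(t-1), ...
    y_next = (-sum(ai * yi for ai, yi in zip(a, y_hist))
              + sum(bi * ui for bi, ui in zip(b, u_hist)))
    y_hist = [y_next] + y_hist[:-1]         # y(t+1), y(t), ...
    y_direct.append(y_next)

max_gap = max(abs(p - q) for p, q in zip(y_ss, y_direct))
last_row_is_zero = all(v == 0.0 for v in Phi[-1])
print(max_gap, last_row_is_zero)
```

The zero last row reflects the fact that the newest stored input $u(t)$ is injected through $G$ alone, so no choice of plant coefficients can make this $\Phi$ invertible when $n_b \ge 2$.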
Finally, the nonsingularity condition on $\Phi$ rules out plants described by FIR models.
Since FIR models can approximate open-loop stable plants as tightly as we wish,
the lack in this case of a proof of the stabilizing property for the RHR law (7)
appears conceptually disturbing and restrictive for some applications. For the
above reasons, we now move on to proving that the stabilizing properties of
the zero terminal state RHR extend to the general case of a possibly singular
state-transition matrix. Such an extension is obtained via two different methods of
proof. These methods hinge upon two different monotonicity properties of the cost:
one upon the cost monotonicity w.r.t. the increase of the prediction horizon for a
fixed initial state; the other upon the cost monotonicity along the trajectories of
the controlled system for a fixed prediction horizon length. While the former proof
relies again on the FARE argument, the latter does not use linearity of the plant and,
hence, can also cover nonlinear plants and input and state-related constraints.

Method of proof 1 Consider again the plant (1-1) with $\Sigma = (\Phi, G, H)$ completely
reachable and detectable. Let $\nu$, $\nu \le n = \dim \Phi$, be the controllability index
of $\Sigma$ [Kai80], viz. the smallest positive integer such that the $\nu$-th order reachability
matrix $R_\nu$ of $\Sigma$

$$R_\nu := [\, \Phi^{\nu-1}G \;\cdots\; \Phi G \;\; G \,]$$

has full row-rank. Then, there are input sequences $u_{[0,\nu)}$ which satisfy the zero
terminal state constraint

$$O_x = x(\nu) = \Phi^{\nu} x + R_\nu u^0_{\nu-1} \tag{5.3-28}$$

for any initial state $x := x(0)$. For every $u_{[0,\nu)}$ satisfying (28) we write

$$J\left(x, u_{[0,\nu)} \mid x(\nu) = O_x\right) := \sum_{k=0}^{\nu-1} \ell\left(y(k), u(k)\right) \tag{5.3-29}$$

$$\ell\left(y(k), u(k)\right) := \|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}$$

with $\psi_y = \psi_y' > 0$ and $\psi_u = \psi_u' > 0$. Defining

$$V_\nu\left(x \mid x(\nu) = O_x\right) := \min_{u_{[0,\nu)}} J\left(x, u_{[0,\nu)} \mid x(\nu) = O_x\right) \tag{5.3-30}$$

we show next that, irrespective of the possible singularity of $\Phi$, we have

$$V_\nu\left(x \mid x(\nu) = O_x\right) = x'P(\nu)x \tag{5.3-31a}$$

$$P(\nu) = P'(\nu) \ge 0 \tag{5.3-31b}$$

Hereafter, we show how to compute $P(\nu)$ without resorting to (22). To this end,
set

$$y(k) = Hx(k) = w_1 u(k-1) + \cdots + w_k u(0) + S_k x$$

where

$$w_k := H\Phi^{k-1}G$$

and

$$S_k := H\Phi^k$$
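Hence,

$$y^1_{\nu-1} = W u^0_{\nu-1} + \Gamma x \tag{5.3-32}$$

where

$$W := \begin{bmatrix} w_1 & & & & 0 \\ w_2 & w_1 & & & 0 \\ \vdots & & \ddots & & \vdots \\ w_{\nu-1} & w_{\nu-2} & \cdots & w_1 & 0 \end{bmatrix}$$

(the last block column being zero, since no output up to time $\nu - 1$ depends on $u(\nu-1)$) and

$$\Gamma := [\, S_1' \;\; S_2' \;\cdots\; S_{\nu-1}' \,]' \tag{5.3-33}$$

Therefore

$$\sum_{k=0}^{\nu-1} \ell\left(y(k), u(k)\right) = \|y(0)\|^2_{\psi_y} + \|y^1_{\nu-1}\|^2_{L_y} + \|u^0_{\nu-1}\|^2_{L_u} = \|y(0)\|^2_{\psi_y} + \|W u^0_{\nu-1} + \Gamma x\|^2_{L_y} + \|u^0_{\nu-1}\|^2_{L_u} \tag{5.3-34}$$

where

$$L_y := \text{block-diag}\{\psi_y\} \;\; ((\nu-1)\text{-times}), \qquad L_u := \text{block-diag}\{\psi_u\} \;\; (\nu\text{-times})$$

In order to minimize (29) under the constraint (28) we form the Lagrangian function
[Luc69]

$$\mathcal{L} := \sum_{k=0}^{\nu-1} \ell\left(y(k), u(k)\right) + \left[ \Phi^{\nu} x + R_\nu u^0_{\nu-1} \right]' \lambda$$

where $\lambda \in \mathbb{R}^n$ is a vector of Lagrangian multipliers. The gradient of $\mathcal{L}$ w.r.t. $u^0_{\nu-1}$
vanishes for

$$u^0_{\nu-1} = -M^{-1}\left[ W'L_y\Gamma x + \tfrac{1}{2} R_\nu' \lambda \right] \tag{5.3-35}$$

$$M := L_u + W'L_yW$$

Premultiplying both sides of (35) by $R_\nu$, we get

$$R_\nu u^0_{\nu-1} = -R_\nu M^{-1}\left[ W'L_y\Gamma x + \tfrac{1}{2} R_\nu' \lambda \right] = -\Phi^{\nu} x \qquad [(28)] \tag{5.3-36}$$

Using (36) into (35), we find

$$u^0_{\nu-1} = -M^{-1}\left\{ \left[ I - Q R_\nu M^{-1} \right] W'L_y\Gamma + Q\Phi^{\nu} \right\} x \tag{5.3-37}$$

$$Q := R_\nu' \left( R_\nu M^{-1} R_\nu' \right)^{-1}$$

Thus, $P(\nu)$ in (31) can be found by using (37) into (34). Hence, (31) is established
without requiring nonsingularity of $\Phi$.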
Consider next the zero terminal state regulation over the interval $[0, \nu]$. Taking
into account (31a), we have

$$V_{\nu+1}\left(x \mid x(\nu+1) = O_x\right) = \min_{u(0)} \left\{ \|y(0)\|^2_{\psi_y} + \|u(0)\|^2_{\psi_u} + V_\nu\left(x(1) \mid x(\nu) = O_x\right) \right\} \tag{5.3-38}$$

$$= \min_{u(0)} \left\{ \|y(0)\|^2_{\psi_y} + \|u(0)\|^2_{\psi_u} + x'(1)P(\nu)x(1) \right\} \tag{5.3-39}$$

where

$$x(1) = \Phi x + Gu(0)$$

Eq. (38) is the same as a Dynamic Programming step in a standard LQOR problem
and yields (Cf. Theorem 2.4-1)

$$u(0) = -\left[ \psi_u + G'P(\nu)G \right]^{-1} G'P(\nu)\Phi x \tag{5.3-40}$$

$$V_{\nu+1}\left(x \mid x(\nu+1) = O_x\right) = x'P(\nu+1)x \tag{5.3-41}$$

$$P(\nu+1) = \Phi'P(\nu)\Phi - \Phi'P(\nu)G\left( \psi_u + G'P(\nu)G \right)^{-1} G'P(\nu)\Phi + H'\psi_y H$$

Further,

$$P(\nu+1) \le P(\nu) \tag{5.3-42}$$

In fact,

$$x'P(\nu+1)x = \min_{u_{[0,\nu+1)}} J\left(x, u_{[0,\nu+1)} \mid x(\nu+1) = O_x\right)$$

$$\le J\left(x, \hat{u}_{[0,\nu)} \otimes O_u \mid x(\nu+1) = O_x\right)$$

$$= J\left(x, \hat{u}_{[0,\nu)} \mid x(\nu) = O_x\right) = x'P(\nu)x$$

Here $\hat{u}_{[0,\nu)}$ denotes the input sequence in (37) and $\otimes$ concatenation. By RDE
monotonicity (Cf. Proposition 2-1), (42) yields

$$P(k+1) \le P(k) , \quad \forall k \ge \nu \tag{5.3-43}$$

Then, by the FARE Argument of Sect. 2 we conclude that the RHR law related to
the zero terminal state regulation problem (3)-(5) yields an asymptotically stable
closed-loop system whenever $N \ge \nu + 1$, irrespective of the possible singularity of $\Phi$.

Theorem 5.3-2. Consider the zero terminal state regulation (3)-(5) with $\psi_y > 0$
and $\psi_u > 0$. Let the plant (1-1) be completely reachable and detectable with
reachability index $\nu$. Then, irrespective of the possible singularity of $\Phi$, the RHR law
(7) relative to (3)-(5) exists unique for every $N \ge \nu$ and stabilizes (1-1) whenever

$$N \ge \nu + 1 \tag{5.3-44}$$

Further, for single-input completely reachable plants, irrespective of $\psi_y$ and $\psi_u$, (7)
yields state-deadbeat regulation whenever

$$N = n \tag{5.3-45}$$
Problem 5.3-3 Consider the plant (1-1) and the zero terminal state regulation problem (3)-(5).
Show that the conclusions of Theorem 2, except possibly for the deadbeat result, are still valid
with (8) replaced by $N \ge \max\{\nu_r + 1, \nu_{\bar{r}}\}$, provided that the plant is completely controllable,
the completely reachable subsystem of the GK canonical reachability decomposition of (1-1) has
reachability index $\nu_r$, and $\nu_{\bar{r}}$ is the smallest nonnegative integer such that $(\Phi_{\bar{r}})^{\nu_{\bar{r}}} = O_{n_{\bar{r}} \times n_{\bar{r}}}$, $\Phi_{\bar{r}}$
being the state-transition matrix of the $n_{\bar{r}}$-dimensional unreachable subsystem.



Method of proof 2 The second stability proof is based on a monotonicity
property similar to, yet different from, the one in (43). Its interest lies in the
fact that it encompasses nonlinear plants

$$x(k+1) = \varphi\left(x(k), u(k)\right) \tag{5.3-46a}$$
$$y(k) = \eta\left(x(k)\right) \tag{5.3-46b}$$

for which $O_n$ is an equilibrium point

$$O_n = \varphi(O_n, O_m) \tag{5.3-47a}$$
$$O_p = \eta(O_n) \tag{5.3-47b}$$

Assume that for such a plant the problem (3)-(6) is uniquely solvable with (3) and
(6) replaced respectively by $u(t+N-k) = f(k, x(t))$ and $u(t) = f(N, x(t))$. We
shall refer to the latter as the zero terminal state RHR feedback law with prediction
horizon $N$. Under the above assumption consider for a fixed $N$ the Bellman function
$V(t) := V_N\left(x(t) \mid x(t+N) = O_n\right)$, the R.H.S. being defined as in (30), along the
trajectories of the controlled system. Let $\hat{u}_{[t,t+N)}$ be the optimal input sequence
for the initial state $x(t)$. We see that $\hat{u}_{[t+1,t+N)} \otimes O_m$ drives the plant state from
$x(t+1) = \varphi\left(x(t), u(t)\right)$ to $O_n$ at time $t+N$ and hence by (47) also at time $t+1+N$.
Then we have, by virtue of (47),

$$V(t) - V(t+1) \ge \|y(t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} \tag{5.3-48}$$

Therefore $\{V(t)\}_{t=0}^{\infty}$ is a monotonically nonincreasing sequence. Hence, $V(t)$ being
nonnegative, as $t \to \infty$ it converges to $V(\infty)$, $0 \le V(\infty) \le V(0)$. Consequently,
summing the two sides of (48) from $t = 0$ to $t = \infty$, we get

$$\infty > V(0) - V(\infty) \ge \sum_{t=0}^{\infty} \left[ \|y(t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} \right] \tag{5.3-49}$$

This, in turn, implies for $\psi_y > 0$ and $\psi_u > 0$

$$\lim_{t \to \infty} y(t) = O_p \quad \text{and} \quad \lim_{t \to \infty} u(t) = O_m \tag{5.3-50}$$

Theorem 5.3-3. Suppose that the zero terminal state regulation problem (3)-(6)
with $\psi_y > 0$, $\psi_u > 0$ and (3) and (6) replaced respectively by $u(t+N-k) = f(k, x(t))$
and $u(t) = f(N, x(t))$ be uniquely solvable for the nonlinear plant (46)-(47).
Then, the zero terminal state RHR law yields asymptotically vanishing I/O
variables.
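
The step leading to (48) can be spelled out as follows. This is a sketch of the standard argument, written with $\ell(y,u) := \|y\|^2_{\psi_y} + \|u\|^2_{\psi_u}$ as in (29) and with $\hat{u}$ denoting the sequence that is optimal from $x(t)$; the hat notation is an assumption made here for readability. Since $\hat{u}_{[t+1,t+N)} \otimes O_m$ is admissible, though not necessarily optimal, for the problem started at $x(t+1)$,

$$\begin{aligned} V(t+1) &\le J\left(x(t+1),\, \hat{u}_{[t+1,t+N)} \otimes O_m \,\middle|\, x(t+1+N) = O_n\right) \\ &= \sum_{k=t+1}^{t+N-1} \ell\left(y(k), \hat{u}(k)\right) + \ell\left(\eta(O_n), O_m\right) \\ &= V(t) - \ell\left(y(t), \hat{u}(t)\right) \end{aligned}$$

The last equality uses $\eta(O_n) = O_p$ from (47b), so the appended stage costs nothing. Rearranging gives (48), with $u(t) = \hat{u}(t)$ being the input actually applied by the RHR law.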

Remark 5.3-1

i. For linear controllable and detectable plants, Theorem 3 implies at once
asymptotic stability of the controlled system.

ii. The method of proof of Theorem 3, though simple and general, does not
unveil the strict connection between zero terminal state RHR and LQR, nor
solvability conditions, issues which are instead explicitly in focus in the
constructive method of proof of Theorem 2.

iii. The method of proof of Theorem 3 can be used to cover also the case of
weights $\psi_y(i) > 0$ and $\psi_u(i) > 0$, $i = 0, 1, \cdots, N-1$, in (3). In such a case
the conclusions of Theorem 3 can be readily shown to hold true provided that

$$\psi_y(i) \le \psi_y(i+1) \quad \text{and} \quad \psi_u(i) \le \psi_u(i+1)$$

for $i = 0, 1, \cdots, N-2$.

iv. Theorem 3 is relevant for its far-reaching consequences on the stability of
the zero terminal state RHR applied to nonlinear plants once solvability is
ensured and complementary system theoretic properties are added.

v. The reader can verify that the presence of hard constraints on input and
state-dependent variables over the semi-infinite horizon $[t, \infty)$ is compatible
with the method of proof of Theorem 3. This makes the results of Theorem
3 of paramount importance for practical applications where control problems
with constraints are ubiquitous.

□



Main points of the section For completely reachable and detectable plants, zero 
terminal state RHR yields an asymptotically stable closed-loop system whenever 
the prediction horizon is larger than the plant reachability index. 



5.4 Stabilizing Dynamic RHR 

We concentrate on SISO plants, initially with unit I/O delay, viz. the intrinsic one. 
Later, we shall take into account the possible presence of larger I/O delays. 

Let us consider the SISO plant described by the difference equation (3-23). We
shall rewrite it formally by polynomials as follows (Cf. Sect. 3.4)

$$A(d)y(t) = B(d)u(t) \tag{5.4-1}$$

where

$$A(d) = 1 + a_1 d + \cdots + a_{n_a} d^{n_a} \tag{5.4-2}$$

$$B(d) = b_1 d + \cdots + b_{n_b} d^{n_b} \tag{5.4-3}$$

are coprime polynomials with $a_{n_a} \cdot b_{n_b} \neq 0$. Consider the state

$$s(t) := \left[ \left( y^{t-n_a+1}_{t} \right)' \;\; \left( u^{t-n_b+1}_{t-1} \right)' \right]' \tag{5.4-4}$$

with

$$n_a := \partial A(d) \quad \text{and} \quad n_b := \partial B(d) \tag{5.4-5}$$

The following lemma points out some structural properties of the state-space
representation of (1) with state vector (4).
Lemma 5.4-1. Consider the plant (1). Let the polynomials $A(d)$ and $B(d)$ in (2)
and, respectively, (3) be coprime. Then, the state-space representation of (1) with
state-vector (4) is completely reachable and reconstructible.

Proof Reachability is discussed in Problem 2.4-5. Reconstructibility trivially follows from the
state choice (4).

The next lemma specializes the stability results of Theorem 3-1 and Theorem 3-2 to
SISO plants.

Lemma 5.4-2. Let the SISO plant $\Sigma = (\Phi, G, H)$ be completely reachable and
detectable. Then the RHR law (3-7) relative to the zero terminal state regulation
(3-3)-(3-5) stabilizes the plant, whenever the prediction horizon satisfies the following
inequality

$$N \ge n_x := \dim \Sigma \tag{5.4-6}$$

According to Lemma 1 and Lemma 2, we can directly construct a stabilizing I/O
RHR for the SISO plant (1)-(5) as follows. Find an optimal open-loop sequence
$u_{[0,N)}$ to the plant initialized from the state $s(0)$

$$u(N-k) = F(k)s(0) , \quad k = 1, \cdots, N \tag{5.4-7}$$

minimizing the performance index (3-4) under the zero terminal state constraint

$$s(N) = \left[ \left( y^{N-n_a+1}_{N} \right)' \;\; \left( u^{N-n_b+1}_{N-1} \right)' \right]' = O_{n_a+n_b-1} \tag{5.4-8}$$

Then, the feedback regulation law

$$u(t) = F(N)s(t) \tag{5.4-9}$$

is the I/O RHR law of interest.

We simplify the above formulation by referring to the extended state

$$s(t) := \left[ \left( y^{t-n+1}_{t} \right)' \;\; \left( u^{t-n+1}_{t-1} \right)' \right]' \tag{5.4-10}$$

where

$$n := \max(n_a, n_b) \tag{5.4-11}$$

denotes the plant order. In fact, the extended state (10) has the same reachability and
reconstructibility properties as the state (4) (Cf. Problem 2.4-5). Thus, referring to (10), taking into
account the implication on the summation (3-4) of the terminal state constraint
$s(N) = O_{2n-1}$, and setting

$$T := N - n + 1 \tag{5.4-12}$$

we can adopt the following formal statement.

Stabilizing I/O RHR (SIORHR) Consider the problem of finding, whenever
it exists, an input sequence $u_{[0,T)}$

$$u(k) = \mathcal{F}(k)s(0) , \quad k = 0, \cdots, T-1 \tag{5.4-13}$$

to the SISO plant (1)-(5) minimizing

$$J\left(s(0), u_{[0,T)}\right) = \sum_{k=0}^{T-1} \left[ \psi_y y^2(k) + \psi_u u^2(k) \right] \tag{5.4-14}$$

under the constraints

$$u^{T}_{T+n-2} = O_{n-1} \qquad y^{T}_{T+n-1} = O_n \tag{5.4-15}$$

Then, the dynamic feedback regulation law

$$u(t) = \mathcal{F}(0)s(t) \tag{5.4-16}$$

will be referred to as the Stabilizing I/O RHR (SIORHR) law with prediction
horizon $T$ and design knobs $(T, \psi_y, \psi_u)$.

Figure 5.4-1: Plant with I/O transport delay $\ell$ (input $u(t)$, intermediate output $\bar{y}(t)$, output $y(t) = \bar{y}(t-\ell)$).

Considering that (6), referred to the state-vector $s(t)$, yields $N \ge 2n-1$ or $T \ge n$,
we arrive at the following conclusion.

Theorem 5.4-1. Let the polynomials $A(d)$ and $B(d)$ be coprime. Then, provided
that

$$\psi_u > 0 \tag{5.4-17}$$

the I/O RHR law (16) stabilizes the SISO plant (1) whenever

$$T \ge n \tag{5.4-18}$$

Further, irrespective of $\psi_u$, for

$$T = n \tag{5.4-19}$$

(16) yields a state-deadbeat closed-loop system.

The next step is to extend the above stabilizing properties to plants not only with
the intrinsic I/O delay but possibly exhibiting arbitrary I/O transport delays. In
particular, we focus on plants described by difference equations of the form

$$A(d)y(t) = d^{\ell} B(d)u(t) = \mathcal{B}(d)u(t) \tag{5.4-20}$$

Here: $\mathcal{B}(d) := d^{\ell} B(d)$; $\ell$ can be any nonnegative integer and denotes the plant
deadtime or I/O transport delay; $A(d)$ and $B(d)$ are polynomials as in (2) and (3).
Let us assume that $A(d)$ and $B(d)$ are coprime. Then, for $\ell = 0$, (20) satisfies the
conditions in Theorem 1 under which stabilizing I/O RHR exists. In order to deal
with a nonzero deadtime, it is convenient to represent the plant (20) as in Fig. 1.
Here, an intermediate variable $\bar{y}(t) = y(t+\ell)$ is indicated.

The I/O RHR (11)-(16) with $y(k)$ replaced by $\bar{y}(k) = y(k+\ell)$ yields, according
to Theorem 1, a stabilizing compensator

$$u(t) = \mathcal{F}(0)\bar{s}(t) \tag{5.4-21}$$

$$\bar{s}(t) := \left[ \left( \bar{y}^{t-n+1}_{t} \right)' \;\; \left( u^{t-n+1}_{t-1} \right)' \right]' \tag{5.4-22}$$

whenever $T \ge n$. The problem here is that (21) is anticipative. However, by noting
that the system in Fig. 1 with output $\bar{y}(t)$ has a state-space description $(\Phi, G, H)$
with state $\bar{s}(t)$, one sees that the anticipative entries of $\bar{s}(t)$ can be expressed in
terms of outputs up to time $t$ and inputs up to time $t-1$ as follows for $i = 0, 1, \cdots, \ell-1$

$$\bar{y}(t-i) = y(t+\ell-i) = H\left[ Gu(t-i-1) + \cdots + \Phi^{\ell-i-1}Gu(t-\ell) + \Phi^{\ell-i}\bar{s}(t-\ell) \right] \tag{5.4-23}$$

Hence, we can conclude that (21) can be uniquely written as a linear combination
of the components of a new state

$$s(t) := \left[ \left( y^{t-n_a+1}_{t} \right)' \;\; \left( u^{t-\ell-n_b+1}_{t-1} \right)' \right]' \tag{5.4-24}$$

Therefore, in the presence of a plant deadtime $\ell$, (13)-(16) are modified as follows.



SIORHR ($\ell \ge 0$) Let $s(t)$ be as in (24). Consider the problem of finding,
whenever it exists, an input sequence $u_{[0,T)}$

$$u(k) = \mathcal{F}(k)s(0) , \quad k = 0, \cdots, T-1 \tag{5.4-25}$$

to the SISO plant (20) minimizing

$$J\left(s(0), u_{[0,T)}\right) = \sum_{k=0}^{T-1} \left[ \psi_y y^2(k+\ell) + \psi_u u^2(k) \right] \tag{5.4-26}$$

under the constraints

$$u^{T}_{T+n-2} = O_{n-1} \qquad y^{\ell+T}_{\ell+T+n-1} = O_n \tag{5.4-27}$$

Then, the SIORHR law is given by the dynamic feedback compensation

$$u(t) = \mathcal{F}(0)s(t) \tag{5.4-28}$$

We note that (25)-(28) subsume (13)-(16). Nonetheless, according to our
considerations preceding (25), the conclusions of Theorem 1 hold true in this more general
case.

Theorem 5.4-2. Consider the SISO plant (20) with deadtime $\ell$ and $A(d)$ and $B(d)$
coprime polynomials. Then, provided that $\psi_u > 0$, the SIORHR law (28) stabilizes
(20) whenever

$$T \ge n \tag{5.4-29}$$

$n$ being the plant order. Further, irrespective of $\psi_u \ge 0$, for

$$T = n \tag{5.4-30}$$

(28) yields a state-deadbeat closed-loop system.
The final comment here is that, as can be seen from (25)-(28), SIORHR
approaches, as $T$ increases, the steady-state LQOR.

Main points of the section Zero terminal state RHR can be adapted to regulate 
sampled data plants with I/O transport delays. The resulting dynamic compen- 
sator, referred to as SIORHR (Stabilizing I/O Receding Horizon Regulator), yields 
stable closed-loop systems under sharp conditions. By varying its prediction hori- 
zon length SIORHR yields different regulation laws, ranging from state-deadbeat 
to steady-state LQOR. 



5.5 SIORHR Computations 



No mention has been made so far on how to compute (4-28). In fact the formulae
(3-14) and (3-15) cannot be used, the $\Phi$ matrix being singular in the case of interest
to us. We now proceed to find an algorithm for computing (4-28) which, though not
the most convenient numerically, is helpful for uncovering the relationship between
SIORHR and other RHR laws to be discussed next. Consider again the state (4-24)

$$s(t) \in \mathbb{R}^{n_s} \tag{5.5-1}$$

for the plant (4-20). It can be updated in the usual way

$$s(t+1) = \Phi s(t) + Gu(t) , \qquad y(t) = Hs(t) \tag{5.5-2}$$

for a suitable matrix triplet $(\Phi, G, H)$. Then, similarly to (1-7),

$$s(k) = \Phi^k s(0) + R_k u^0_{k-1} \tag{5.5-3}$$
$$R_k := [\, \Phi^{k-1}G \;\cdots\; \Phi G \;\; G \,] \tag{5.5-4}$$

$$y(k) = Hs(k) = w_1 u(k-1) + \cdots + w_k u(0) + S_k s(0) \tag{5.5-5}$$

where

$$w_k := H\Phi^{k-1}G \tag{5.5-6}$$

and

$$S_k := H\Phi^k \tag{5.5-7}$$

Being $w_1 = w_2 = \cdots = w_\ell = 0$, we can also write

$$y^{\ell+1}_{\ell+T+n-1} = W u^0_{T+n-2} + \Gamma s(0) \tag{5.5-8}$$

with $n$ as in (4-11), $W$ the lower triangular Toeplitz matrix having on its first
column the first nonzero $T+n-1$ samples of the impulse response associated with
the plant transfer function $d^{\ell}B(d)/A(d)$

$$W := \begin{bmatrix} w_{\ell+1} & & & \\ w_{\ell+2} & w_{\ell+1} & & \\ \vdots & & \ddots & \\ w_{\ell+T+n-1} & w_{\ell+T+n-2} & \cdots & w_{\ell+1} \end{bmatrix} \tag{5.5-9}$$
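and

$$\Gamma := [\, S'_{\ell+1} \;\; S'_{\ell+2} \;\cdots\; S'_{\ell+T+n-1} \,]' \tag{5.5-10}$$

Let $W$ and $\Gamma$ be partitioned as follows

$$W = \begin{bmatrix} W_1 & O \\ L & W_2 \end{bmatrix} \tag{5.5-11}$$

$$\Gamma = \begin{bmatrix} \Gamma_1 \\ \Gamma_2 \end{bmatrix} \tag{5.5-12}$$

where the upper blocks have $T-1$ rows, the lower blocks have $n$ rows, and, in $W$, the left
blocks have $T$ columns and the right blocks $n-1$ columns. Then, (8) can be rewritten as follows

$$y^{\ell+1}_{\ell+T-1} = W_1 u^0_{T-1} + \Gamma_1 s(0) \tag{5.5-13}$$
$$y^{\ell+T}_{\ell+T+n-1} = L u^0_{T-1} + W_2 u^{T}_{T+n-2} + \Gamma_2 s(0) \tag{5.5-14}$$

and (4-26) becomes

$$J\left(s(0), u^0_{T-1}\right) = \psi_y \|y^{\ell+1}_{\ell+T-1}\|^2 + \psi_u \|u^0_{T-1}\|^2 = \psi_y \|W_1 u^0_{T-1} + \Gamma_1 s(0)\|^2 + \psi_u \|u^0_{T-1}\|^2 \tag{5.5-15}$$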

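Further, the constraints (4-27) become

$$u^{T}_{T+n-2} = O_{n-1} \tag{5.5-16}$$

$$L u^0_{T-1} + \Gamma_2 s(0) = O_n \tag{5.5-17}$$

In conclusion problem (4-25)-(4-27) can be solved by finding the vector $u^0_{T-1} \in \mathbb{R}^T$
minimizing the Lagrangian function

$$\mathcal{L} := J\left(s(0), u^0_{T-1}\right) + \left[ L u^0_{T-1} + \Gamma_2 s(0) \right]' \lambda$$

where $\lambda \in \mathbb{R}^n$ is a vector of Lagrangian multipliers. The gradient of $\mathcal{L}$ w.r.t. $u^0_{T-1}$
vanishes for

$$u^0_{T-1} = -M^{-1}\left[ \psi_y W_1'\Gamma_1 s(0) + \tfrac{1}{2} L'\lambda \right] \tag{5.5-18}$$

$$M := \psi_u I_T + \psi_y W_1'W_1 \tag{5.5-19}$$

Premultiplying both sides by $L$, we get

$$L u^0_{T-1} = -\psi_y L M^{-1} W_1'\Gamma_1 s(0) - \tfrac{1}{2} L M^{-1} L'\lambda = -\Gamma_2 s(0) \qquad [(17)] \tag{5.5-20}$$

Using (20) into (18), we find

$$u^0_{T-1} = -M^{-1}\left[ \psi_y \left( I_T - Q L M^{-1} \right) W_1'\Gamma_1 + Q\Gamma_2 \right] s(0) \tag{5.5-21}$$

$$Q := L'\left( L M^{-1} L' \right)^{-1} \tag{5.5-22}$$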
We note that $Q$ exists provided that the matrix $L$ in (11) has rank $n$. Now

$$L = \begin{bmatrix} w_{\ell+T} & w_{\ell+T-1} & \cdots & w_{\ell+2} & w_{\ell+1} \\ w_{\ell+T+1} & w_{\ell+T} & \cdots & w_{\ell+3} & w_{\ell+2} \\ \vdots & \vdots & & \vdots & \vdots \\ w_{\ell+T+n-1} & w_{\ell+T+n-2} & \cdots & w_{\ell+n+1} & w_{\ell+n} \end{bmatrix} \tag{5.5-23}$$

is obtained by reversing the column order of the Hankel matrix [Kai80] $H_{n,T}$
associated with $d^{\ell}B(d)/A(d)$. Then, under the same assumptions as in Theorem 4-2,
$\text{rank}\, L = n$, provided that $T \ge n$.

Theorem 5.5-1. Under the same assumptions as in Theorem 4-2, the open-loop
solution of (4-25)-(4-27) for $T \ge n$ is given by (21). Then, the I/O RHR

$$u(t) = -e_1' M^{-1}\left[ \psi_y \left( I_T - Q L M^{-1} \right) W_1'\Gamma_1 + Q\Gamma_2 \right] s(t) \tag{5.5-24}$$

stabilizes the plant whenever

$$T \ge n \tag{5.5-25}$$

Further, for $T = n$, (24) becomes

$$u(t) = -e_1' L^{-1}\Gamma_2 s(t) \tag{5.5-26}$$

and yields a state-deadbeat closed-loop system.
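
The deadbeat case (5.5-26) is small enough to try by hand. The sketch below, in plain Python, is an illustration under our own assumptions: the helper names `response`, `solve` and `deadbeat_input`, the initial condition, and the choice of the plant $A(d) = 1 - 0.7d$, $B(d) = 0.9d - 0.6d^2$ with $\ell = 0$ are ours. It obtains $L$ and $\Gamma_2 s(t)$ by running the plant model, solves $L u^0_{T-1} = -\Gamma_2 s(t)$ for $T = n$, applies the first input in receding horizon fashion, and records the closed-loop signals.

```python
def response(a, b, ell, y_past, u_past, inputs):
    """Outputs y(1), ..., y(len(inputs)) of A(d) y = d^ell B(d) u.

    y_past = [y(0), y(-1), ...] (na values), u_past = [u(-1), u(-2), ...]
    (ell + nb - 1 values), inputs = [u(0), u(1), ...].
    """
    y_hist, u_hist, out = list(y_past), list(u_past), []
    for u in inputs:
        u_hist = [u] + u_hist               # u(t), u(t-1), ...
        y_next = (-sum(ai * yi for ai, yi in zip(a, y_hist))
                  + sum(bi * u_hist[ell + i] for i, bi in enumerate(b)))
        y_hist = [y_next] + y_hist
        out.append(y_next)
    return out


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def deadbeat_input(a, b, ell, y_past, u_past):
    """First input of the plan solving L u = -Gamma_2 s for T = n, cf. (5.5-26)."""
    n = max(len(a), len(b))
    T, K = n, ell + 2 * n - 1
    zero_y, zero_u = [0.0] * len(a), [0.0] * (ell + len(b) - 1)
    w = response(a, b, ell, zero_y, zero_u, [1.0] + [0.0] * (K - 1))  # w[k-1] = w_k
    p = response(a, b, ell, y_past, u_past, [0.0] * K)                # p[k-1] = S_k s
    # Entry (i, j) of L in (5.5-23) is w_{ell+T+i-j}.
    L = [[w[ell + T + i - j - 1] for j in range(T)] for i in range(n)]
    gamma2_s = [p[ell + T + i - 1] for i in range(n)]
    return solve(L, [-g for g in gamma2_s])[0]


a, b, ell = [-0.7], [0.9, -0.6], 0          # A(d) = 1 - 0.7d, B(d) = 0.9d - 0.6d^2
y_past, u_past = [1.0], [0.5]               # assumed s(0): y(0) = 1, u(-1) = 0.5
ys, us = [], []                             # ys[k] = y(k+1), us[k] = u(k)
for t in range(6):
    u = deadbeat_input(a, b, ell, y_past, u_past)
    y_next = response(a, b, ell, y_past, u_past, [u])[0]
    us.append(u)
    ys.append(y_next)
    y_past = ([y_next] + y_past)[:len(a)]
    u_past = ([u] + u_past)[:ell + len(b) - 1]
print(ys, us)
```

For this plant $n = 2$, and the recorded output and input are zero, up to rounding, from time 2 onwards, in line with the state-deadbeat claim of the theorem. A linear solve is used instead of forming $L^{-1}$ explicitly, which is a common numerical preference and not something prescribed by the text.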

Eq. (24) is a formula for computing SIORHR. There are, however, various ways
to carry out the involved computations. Two alternatives are considered hereafter.

One basic ingredient of (24) is (5). In fact, the plant parameters in (5), viz.
$w_i$, $i = 1, 2, \cdots$, and $S_k$, are needed to compute $\Gamma_1 s(t)$ and $\Gamma_2 s(t)$ in (24).

In the present deterministic context, we shall refer to (5) as the k-step ahead
output evaluation formula in that it yields $y(k)$ in terms of the state $s(0)$, once the
exogenous sequence $u_{[0,k)}$ is specified. The use of a special name for (5) is justified
by the fact that, as will be seen in the remaining part of this chapter, (5) plays a
central role in other dynamic RHR problems. We describe two ways to compute
(5).

The first, referred to as the long-division procedure, is based on solving w.r.t.
the polynomial pair $(Q_k(d), G_k(d))$ the Diophantine equation

$$1 = A(d)Q_k(d) + d^k G_k(d) , \qquad \partial Q_k(d) \le k-1 \tag{5.5-27}$$

Since $A(d)$ and $d^k$ are coprime, the solution of (27) exists and is unique (Appendix C).
Multiplying both sides of (4-20) by $Q_k(d)$, we get

$$y(k) = Q_k(d)\mathcal{B}(d)u(k) + G_k(d)y(0) \tag{5.5-28}$$

It is immediate to relate the polynomials in (28) with the parameters $w_i$ and $S_k$ in
(5), e.g. $w_1, \cdots, w_k$ are given by the $k$ leading coefficients of $Q_k(d)\mathcal{B}(d)$.

It is to be pointed out that there exist recursive formulae for computing $(Q_{k+1}(d), G_{k+1}(d))$
given $(Q_k(d), G_k(d))$. To see this, it is enough to consider that (27) represents the
$k$-th stage of the long-division of 1 by $A(d)$, with $Q_k(d) = q_0 + q_1 d + \cdots + q_{k-1}d^{k-1}$
the quotient and $d^k G_k(d)$ the remainder. Thus, going to the next stage of the
long-division, we have

$$Q_{k+1}(d) = Q_k(d) + q_k d^k \tag{5.5-29}$$

and $d^{k+1}G_{k+1}(d) = d^k G_k(d) - q_k d^k A(d)$ or

$$dG_{k+1}(d) = G_k(d) - q_k A(d) \tag{5.5-30}$$

Setting $d = 0$ in (30), we find

$$q_k = G_k(0) \tag{5.5-31}$$

Then, $Q_{k+1}(d)$ and $G_{k+1}(d)$ can be recursively computed via (29)-(31) initialized
as follows:

$$Q_1(d) = 1 ; \qquad dG_1(d) = 1 - A(d)$$
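
The recursion (29)-(31) translates directly into a few lines of code. The following is a minimal sketch in plain Python, with polynomials stored as coefficient lists in increasing powers of $d$; the function names `long_division`, `poly_mul` and `identity_residual`, and the example plant $A(d) = 1 - 0.7d$, $\mathcal{B}(d) = 0.9d - 0.6d^2$, are assumptions made for illustration. It also checks the Diophantine identity (27) at every stage.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r


def long_division(A, k_max):
    """Return {k: (Q_k, G_k)}, k = 1..k_max, with 1 = A Q_k + d^k G_k.

    A = [1, a_1, ..., a_na] with na >= 1, in increasing powers of d.
    """
    assert A[0] == 1
    Q = [1.0]                               # Q_1(d) = 1
    G = [-c for c in A[1:]]                 # d G_1(d) = 1 - A(d)
    out = {1: (Q[:], G[:])}
    for k in range(1, k_max):
        q_k = G[0]                          # (5.5-31): q_k = G_k(0)
        Q = Q + [q_k]                       # (5.5-29): append q_k d^k
        diff = [g - q_k * c for g, c in zip(G + [0.0], A)]
        G = diff[1:]                        # (5.5-30): divide G_k - q_k A by d
        out[k + 1] = (Q[:], G[:])
    return out


def identity_residual(A, Q, G, k):
    """Largest coefficient of A Q_k + d^k G_k - 1 in absolute value."""
    lhs = poly_mul(A, Q)
    for i, g in enumerate(G):
        lhs[k + i] += g
    lhs[0] -= 1.0
    return max(abs(c) for c in lhs)


A = [1.0, -0.7]                             # A(d) = 1 - 0.7 d
B = [0.0, 0.9, -0.6]                        # B(d) = 0.9 d - 0.6 d^2, ell = 0
table = long_division(A, 3)
Q3, G3 = table[3]
w = poly_mul(Q3, B)[1:4]                    # coefficients of d, d^2, d^3: w_1, w_2, w_3
residuals = [identity_residual(A, Q, G, k) for k, (Q, G) in table.items()]
print(Q3, G3, w, residuals)
```

Each stage costs only one scaled subtraction of $A(d)$, which is why the recursive form is preferable to solving (27) from scratch for every $k$.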

The other way to compute (5) will be referred to as the plant response procedure. In
order to describe it, it is convenient to denote by $S\left(k, s(0), u^0_{k-1}\right)$ the plant output
response at time $k$, from the state $s(0)$ at time 0, to the inputs $u^0_{k-1}$. Then, we
have

$$w_k = S\left(k, s(0) = O, u^0_{k-1} = e_1\right) \tag{5.5-32}$$

where $e_1$ is the first vector of the natural basis of $\mathbb{R}^k$. Further, if $S_{k,r}$ denotes the
$r$-th component of the row vector $S_k$, we have

$$S_{k,r} = S\left(k, e_r, O_k\right) \tag{5.5-33}$$

Here $e_r$ denotes the $r$-th vector of the natural basis of $\mathbb{R}^{n_s}$.
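
The plant response procedure amounts to running the plant model with suitable initial states and inputs. The sketch below, in plain Python, illustrates (32) and (33) under our own assumptions: the helper names `response` and `free_response_rows`, the state ordering (oldest sample first, outputs before inputs, as in (4-24)), and the example plant and test signals are not from the text. It then checks the k-step ahead formula (5) by superposition.

```python
def response(a, b, ell, y_past, u_past, inputs):
    """Outputs y(1), ..., y(len(inputs)) of A(d) y = d^ell B(d) u.

    y_past = [y(0), y(-1), ...] (na values), u_past = [u(-1), u(-2), ...]
    (ell + nb - 1 values), inputs = [u(0), u(1), ...].
    """
    y_hist, u_hist, out = list(y_past), list(u_past), []
    for u in inputs:
        u_hist = [u] + u_hist               # u(t), u(t-1), ...
        y_next = (-sum(ai * yi for ai, yi in zip(a, y_hist))
                  + sum(bi * u_hist[ell + i] for i, bi in enumerate(b)))
        y_hist = [y_next] + y_hist
        out.append(y_next)
    return out


def free_response_rows(a, b, ell, K):
    """Rows S_1, ..., S_K obtained as in (5.5-33), one basis vector at a time."""
    na, nu = len(a), ell + len(b) - 1
    ns = na + nu
    cols = []
    for r in range(ns):
        e = [0.0] * ns
        e[r] = 1.0                          # s(0) = e_r
        y_past = e[:na][::-1]               # newest output first
        u_past = e[na:][::-1]               # newest input first
        cols.append(response(a, b, ell, y_past, u_past, [0.0] * K))
    return [[cols[r][k] for r in range(ns)] for k in range(K)]


a, b, ell = [-0.7], [0.9, -0.6], 0          # A(d) = 1 - 0.7d, B(d) = 0.9d - 0.6d^2
K = 4
na, nu = len(a), ell + len(b) - 1
# (5.5-32): zero initial state, unit pulse at time 0; w[k-1] = w_k.
w = response(a, b, ell, [0.0] * na, [0.0] * nu, [1.0] + [0.0] * (K - 1))
S = free_response_rows(a, b, ell, K)        # S[k-1][r-1] = S_{k,r}

# Superposition check of (5.5-5) on an arbitrary state and input sequence.
s0 = [2.0, -1.0]                            # y(0) = 2, u(-1) = -1
u_seq = [0.5, -0.3, 0.1, 0.0]
y_sim = response(a, b, ell, s0[:na][::-1], s0[na:][::-1], u_seq)
y_formula = [sum(w[i] * u_seq[k - 1 - i] for i in range(k))
             + sum(S[k - 1][r] * s0[r] for r in range(na + nu))
             for k in range(1, K + 1)]
gap = max(abs(p - q) for p, q in zip(y_sim, y_formula))
print(w, S[0], gap)
```

The same `response` routine serves both purposes, which is the practical appeal of the procedure: no polynomial algebra is needed, only a simulator of the plant model.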

We finally mention another possibility closely related to the plant response
procedure. We see that the last additive term of (5) can be obtained as follows

$$p(k) := S_k s(0) = S\left(k, s(0), O_k\right) \tag{5.5-34}$$

Then, (24) can be computed via algebraic operations on data obtained by running
the plant model as in (32) and (34) and setting

$$\Gamma_1 s(0) = [\, p(\ell+1) \;\cdots\; p(\ell+T-1) \,]' \tag{5.5-35}$$

and

$$\Gamma_2 s(0) = [\, p(\ell+T) \;\cdots\; p(\ell+T+n-1) \,]' \tag{5.5-36}$$

In the present deterministic context, where the plant is preassigned and fixed,
the use of (32) and (33) appears the most convenient in that it allows us to compute
the feedback in (24) once and for all. When the plant is time-varying, e.g. in program
control or adaptive regulation, the use of (32), (34)-(36) can become preferable. In
fact, the latter procedure circumvents an explicit feedback computation by providing
us directly with the control variable $u(t)$ in (24).

Main points of the section The SIORHR law is given by the formula (24).
This can be used provided that the plant parameters in the k-step ahead output
evaluation formula (5) are computed. To this end, two alternative procedures are
described: the long-division procedure (27)-(31); and the plant response procedure
(33) or (32) and (34)-(36).

Problem 5.5-1 Consider the problem of finding an input sequence $u_{[0,T+n)}$, $u(k) = \mathcal{F}(k)s(0)$,
$k = 0, \cdots, T+n-1$, to the plant (4-20) minimizing

T-l T+n-1 

fc=0 fc=T 
for $\psi_y > 0$, $\psi_u > 0$ and $\lambda > 0$. Show that the limit for $\lambda \to \infty$ of the solution of such a problem
coincides with (21), provided that the latter is well-defined.
5.6 Generalized Predictive Regulation

Generalized Predictive Regulation (GPR) is a form of dynamic compensation
stemming from a RHR problem similar to the one of (4-13)-(4-16).

GPR Define $s(t)$ as in (4-14). Consider the problem of finding, whenever it
exists, the input sequence $u_{[0,N_u)}$

$$u(k) = \mathcal{F}(k)s(0) , \quad k = 0, \cdots, N_u-1 \tag{5.6-1}$$

to the SISO plant (4-1)-(4-3) minimizing for $N_u \le N_2$

$$J_{GPR} = \sum_{k=N_1}^{N_2} y^2(k) + \psi_u \sum_{k=0}^{N_u-1} u^2(k) \tag{5.6-2}$$

under the constraints

$$u^{N_u}_{N_2-1} = O_{N_2-N_u} \tag{5.6-3}$$

Then, the dynamic feedback regulation law

$$u(t) = \mathcal{F}(0)s(t) \tag{5.6-4}$$

is referred to as the GPR law with prediction horizon $N_2$ and design knobs
$(N_1, N_2, N_u, \psi_u)$.

If $N_1 = 0$ we have

$$J_{GPR} = \sum_{k=0}^{N_u-1} \left[ y^2(k) + \psi_u u^2(k) \right] + \sum_{k=N_u}^{N_2} y^2(k) \tag{5.6-5}$$

Then, under (5), as $N_u \to \infty$ GPR approaches the steady-state LQOR. The
stabilizing properties of the latter are then inherited by GPR as $N_u \to \infty$. This feature
does not reassure us on the stability of the GPR compensated system for small
or moderate values of $N_u$. It is in fact to be pointed out that, since the GPR
computational burden greatly increases with $N_u$, it is mandatory, particularly in
adaptive applications, to use small or moderate values of $N_u$. On the other hand,
we already know (Sect. 2.5) that the RDE converges slowly to its steady-state
solution as its regulation horizon increases. For this reason the claim that GPR
stabilizes the plant for a "large enough" $N_u$, even if true, is of modest practical
interest since the values of $N_u$ for which GPR is stabilizing may be too large. In
addition, such values are in general not so easily predictable, their determination
typically requiring computer analysis.

A design knob choice under which, if $A(d)$ and $B(d)$ are coprime, GPR can be
proved [CM89] to be stabilizing is the following.

$$N_1 = N_u \ge n , \quad N_2 - N_1 \ge n-1 , \quad \psi_u \downarrow 0 \tag{5.6-6}$$

Even if limiting arguments are avoided, the difficulty here is that one has to take a
"vanishingly small" value for $\psi_u$. For a given plant, it is not immediate to establish
how small such a value must be so as to guarantee stability of the closed-loop
system. In this respect, for the second order plant

$$A(d) = 1 - 0.7d \qquad B(d) = 0.9d - 0.6d^2$$

and GPR design knob settings

$$N_1 = N_u = 2 \quad \text{and} \quad N_2 = 4 ,$$

a computer analysis, reported in [BGW90], shows that, in order for closed-loop
stability to be achieved, it is required to take $\psi_u < 5.2 \times 10^{-13}$!

To the above inconvenience we may add that the design knob selection (6) makes
GPR close to a dynamic version of part a) of Theorem 3-1 which, as remarked, is
of little interest in applications.

In order to uncover GPR stabilizing properties for finite prediction horizons,
it is convenient to look for conditions under which GPR and SIORHR coincide.
By comparing (3) with (4-14), we see that if

$$N_u = T \quad \text{and} \quad N_2 = T+n-1 \tag{5.6-7}$$

GPR and SIORHR input constraints coincide. Further, if

$$N_1 = T \quad \text{and} \quad \psi_u = 0 \tag{5.6-8}$$

(2) becomes

$$J_{GPR} = \sum_{k=T}^{T+n-1} y^2(k) \tag{5.6-9}$$

and the constraints (3) are now

$$u^{T}_{T+n-2} = O_{n-1} \tag{5.6-10}$$

We already know from Theorem 4-1 that for $T = n$ there exists a unique input
sequence $u_{[0,n)}$ making (9) equal to zero. Then, the GPR law related to (7) and
(8) yields for $T = n$ a state-deadbeat closed-loop system, provided that the
polynomials $A(d)$ and $B(d)$ in the plant transfer function are coprime. We also see that
the above state-deadbeat property is retained if, keeping the other design knobs
unchanged, $N_2$ exceeds $2n-1$

$$N_2 > 2n-1 \tag{5.6-11}$$

Other GPR stabilizing results for open-loop stable plants are reported in [SB90].

Referring to the state-space representation with state-vector $s(t)$, the GPR
law can be computed via the related forward RDE, taking into account that the
constraints (3) amount to setting $\psi_u = \infty$ for the first $N_2 - N_u$ iterations initialized
from $P(0) = H'H$. However, it is more convenient to follow an approach similar
to the one adopted for SIORHR in Sect. 5. We first explicitly consider the presence
of a plant I/O transport delay $\ell$ as in (4-20) by restating for such a case the GPR
formulation.



GPR ($\ell \ge 0$) Let $s(t)$ be as follows

$$s(t) := \left[ \left( y^{t-n_a+1}_{t} \right)' \;\; \left( u^{t-\ell-n_b+1}_{t-1} \right)' \right]' \tag{5.6-12}$$

Find, whenever it exists, an input sequence $u_{[0,N_u)}$

$$u(k) = \mathcal{F}(k)s(0) \tag{5.6-13}$$

to the plant (4-20) minimizing for $N_u \le N_2$

$$J_{GPR} = \sum_{k=N_1}^{N_2} y^2(k+\ell) + \psi_u \sum_{k=0}^{N_u-1} u^2(k) \tag{5.6-14}$$

under the constraints

$$u^{N_u}_{N_2-1} = O_{N_2-N_u} \tag{5.6-15}$$

Then, the dynamic feedback regulation law

$$u(t) = \mathcal{F}(0)s(t) \tag{5.6-16}$$

is referred to as the GPR law with prediction horizon $N_2$ and design knobs
$(N_1, N_2, N_u, \psi_u)$.

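Let $W$ be the lower triangular Toeplitz matrix defined as in (5-9) but taken here
to have dimension $N_2 \times N_2$

$$W := \begin{bmatrix} w_{\ell+1} & & & \\ w_{\ell+2} & w_{\ell+1} & & \\ \vdots & & \ddots & \\ w_{\ell+N_2} & w_{\ell+N_2-1} & \cdots & w_{\ell+1} \end{bmatrix} \tag{5.6-17}$$

Let $\Gamma$ be as in (5-10) but taken here to have $N_2$ rows

$$\Gamma := [\, S'_{\ell+1} \;\; S'_{\ell+2} \;\cdots\; S'_{\ell+N_2} \,]' \tag{5.6-18}$$

with $S_k$ as in (5-7). Let $W$ and $\Gamma$ be partitioned as follows

$$W = \begin{bmatrix} \ast & \ast \\ W_G & \ast \end{bmatrix} , \qquad \Gamma = \begin{bmatrix} \ast \\ \Gamma_G \end{bmatrix} \tag{5.6-19}$$

where $\ast$ marks blocks that play no role in the sequel, the upper blocks have $N_1 - 1$ rows,
the lower blocks have $N_2 - N_1 + 1$ rows, and, in $W$, the left blocks have $N_u$ columns and the
right blocks $N_2 - N_u$ columns. Then

$$y^{\ell+1}_{\ell+N_2} = W u^0_{N_2-1} + \Gamma s(0)$$

or

$$\begin{bmatrix} y^{\ell+1}_{\ell+N_1-1} \\ y^{\ell+N_1}_{\ell+N_2} \end{bmatrix} = \begin{bmatrix} \ast & \ast \\ W_G & \ast \end{bmatrix} \begin{bmatrix} u^0_{N_u-1} \\ u^{N_u}_{N_2-1} \end{bmatrix} + \begin{bmatrix} \ast \\ \Gamma_G \end{bmatrix} s(0) \tag{5.6-20}$$

Taking into account (15), we get

$$y^{\ell+N_1}_{\ell+N_2} = W_G u^0_{N_u-1} + \Gamma_G s(0) \tag{5.6-21}$$

Provided that

$$M_G := W_G'W_G + \psi_u I_{N_u} \tag{5.6-22}$$

is nonsingular, it follows that

$$u^0_{N_u-1} = -M_G^{-1} W_G' \Gamma_G s(0) \tag{5.6-23}$$

Once the GPR design knobs are set as in (7) and (8) with $T = n$, we find that (23)
equals $-L^{-1}\Gamma_2 s(0)$ with $L$ and $\Gamma_2$ as in (5-26). Therefore our remarks on the
state-deadbeat version of GPR after (7) are here definitely confirmed. The next theorem
sums up some of the results so far obtained on GPR.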

Theorem 5.6-1. Provided that the matrix $M_G$ in (22) is nonsingular, the GPR
law is given by

$$u(t) = -e_1' M_G^{-1} W_G' \Gamma_G s(t) \tag{5.6-24}$$

Under the design knob choice

$$N_1 = N_u = n ; \quad N_2 \ge 2n-1 ; \quad \psi_u = 0 \tag{5.6-25}$$

and provided that $A(d)$ and $B(d)$ in (4-20) are coprime, (24) yields a state-deadbeat
closed-loop system.
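
The GPR law (24) can be tried on the second order plant quoted earlier in this section. The sketch below, in plain Python, is an illustration under our own assumptions: the helper names `response`, `solve` and `gpr_input`, the restriction $N_1 \ge 1$, and the initial condition are ours. It forms $W_G$ and $\Gamma_G s(t)$ by running the plant model, solves $M_G u^0_{N_u-1} = -W_G'\Gamma_G s(t)$, and closes the loop with the knobs $N_1 = N_u = 2$, $N_2 = 4$, $\psi_u = 0$, which satisfy (25) for $n = 2$.

```python
def response(a, b, ell, y_past, u_past, inputs):
    """Outputs y(1), ..., y(len(inputs)) of A(d) y = d^ell B(d) u.

    y_past = [y(0), y(-1), ...], u_past = [u(-1), u(-2), ...],
    inputs = [u(0), u(1), ...].
    """
    y_hist, u_hist, out = list(y_past), list(u_past), []
    for u in inputs:
        u_hist = [u] + u_hist
        y_next = (-sum(ai * yi for ai, yi in zip(a, y_hist))
                  + sum(bi * u_hist[ell + i] for i, bi in enumerate(b)))
        y_hist = [y_next] + y_hist
        out.append(y_next)
    return out


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def gpr_input(a, b, ell, N1, N2, Nu, psi_u, y_past, u_past):
    """First input of (5.6-23); assumes N1 >= 1 and M_G nonsingular."""
    K = ell + N2
    zero_y, zero_u = [0.0] * len(a), [0.0] * (ell + len(b) - 1)
    w = response(a, b, ell, zero_y, zero_u, [1.0] + [0.0] * (K - 1))  # w[k-1] = w_k
    p = response(a, b, ell, y_past, u_past, [0.0] * K)                # p[k-1] = S_k s
    rows = range(N1, N2 + 1)
    # Row for y(ell+k), column for u(j): entry w_{ell+k-j}, zero if the index is < 1.
    WG = [[w[ell + k - j - 1] if ell + k - j >= 1 else 0.0 for j in range(Nu)]
          for k in rows]
    gs = [p[ell + k - 1] for k in rows]
    MG = [[sum(r[i] * r[j] for r in WG) + (psi_u if i == j else 0.0)
           for j in range(Nu)] for i in range(Nu)]
    rhs = [-sum(r[i] * g for r, g in zip(WG, gs)) for i in range(Nu)]
    return solve(MG, rhs)[0]


a, b, ell = [-0.7], [0.9, -0.6], 0          # plant quoted in this section
y_past, u_past = [1.0], [0.5]               # assumed s(0): y(0) = 1, u(-1) = 0.5
ys, us = [], []                             # ys[k] = y(k+1), us[k] = u(k)
for t in range(6):
    u = gpr_input(a, b, ell, 2, 4, 2, 0.0, y_past, u_past)
    y_next = response(a, b, ell, y_past, u_past, [u])[0]
    us.append(u)
    ys.append(y_next)
    y_past = ([y_next] + y_past)[:len(a)]
    u_past = ([u] + u_past)[:ell + len(b) - 1]
print(ys, us)
```

With these knobs $W_G$ has more rows than columns, yet the least-squares solution still zeroes every penalized output, so the closed loop settles in $n = 2$ steps as in the SIORHR deadbeat case. The sketch says nothing about $\psi_u > 0$, for which the text reports that stability of this plant requires an extremely small weight.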



Problem 5.6-1 Adapt both the long-division procedure and the plant response procedure of 
Sect. 5 to compute the GPR law (24). 



If we compare (24) with (5-24), we see that, from a computational viewpoint, GPR
is less complex than SIORHR. The latter, in fact, requires the inversion of
the matrix $LM^{-1}L'$ which, in turn, is nonsingular if and only if $\text{rank}\, L = n$. On
the contrary, GPR requires only the inversion of $M_G$, which is nonsingular whenever
$\psi_u > 0$.

Main points of the section Like SIORHR, GPR is obtained by solving an input 
output RHR problem. However, unlike SIORHR, which involves both input and 
output constraints, in GPR only input constraints are considered. Consequently, 
GPR has a lower computational complexity than SIORHR. The price paid for it is 
that only few sharp results on GPR stabilizing properties are available. 

Problem 5.6-2 (Connection between zero terminal state RHR and GPR for FIR plants)
Consider the SISO finite impulse response (FIR) plant

$$y(t) = w_1 u(t-1) + \cdots + w_n u(t-n) \qquad (w_n \neq 0)$$

with state-vector

$$x(t) := u^{t-n}_{t-1} = [\, u(t-n) \;\cdots\; u(t-1) \,]'$$

and the following zero terminal state regulation problem. Find, whenever it exists, an input
sequence $u_{[0,N)}$, $u(k) = F(k)x(0)$, $k = 0, 1, \cdots, N-1$, to the above plant minimizing

$$J\left(x(0), u_{[0,N)}\right) = \sum_{k=0}^{N-1} \left[ y^2(k) + \rho u^2(k) \right] \qquad (\rho > 0)$$

subject to the constraint $x(N) = O_n$. Show that: i) the problem above is solvable for $N \ge n$ and
the related RHR law $u(t) = F(0)x(t)$ stabilizes the plant; and ii) the GPR problem with $N_1 = 0$,
$N_2 = N-1$, $N_u = N-n$ and $\psi_u = \rho$ is equivalent to the above zero terminal state regulation
problem.
5.7 Receding Horizon Iterations

Modified Kleinman Iterations 

In view of their control-theoretic interpretation as depicted in Fig. 2.5-1, Kleinman
iterations are closely related to RHR. In fact, in Kleinman iterations the feedback
updating from $F_k$ to $F_{k+1}$ is based on the solution of the RHR problem (1-1)-(1-6)
with prediction horizon of infinite length

$$J(x,u(\cdot)) = \sum_{j=0}^{\infty} \left[ \|x(j)\|^2_{\psi_x} + \|u(j)\|^2_{\psi_u} \right] \tag{5.7-1}$$

and the constraints (1-4) given as follows

$$u(j) = F_k x(j), \qquad j = 1, 2, \cdots \tag{5.7-2}$$

As already remarked, Kleinman iterations have a fast rate of convergence in the
vicinity of the steady-state LQOR feedback but are affected by one major defect:
they must be initialized from a stabilizing feedback-gain matrix. Further,
they have two other negative features: their speed of convergence may slow down
if $F_k$ is far from the steady-state LQOR feedback $F_{LQ}$; and at each iteration
step the computationally cumbersome Lyapunov equation (2.5-3) has to be solved. By
contrast, Riccati iterations do not suffer from such difficulties, but their speed
of convergence is generally not fast. It is worth trying to combine Kleinman and
Riccati-like iterations so as to possibly obtain iterations that do not require a
stabilizing initial feedback and have a more uniform rate of convergence. We
discuss one such modification hereafter.

Rewrite (1) and (2) in a single equation

$$J(x,u,F_k) = \|x\|^2_{\psi_x} + \|u\|^2_{\psi_u} + \sum_{j=1}^{\infty} \|x(j)\|^2_{\psi_x(k)} \tag{5.7-3}$$

where $x = x(0)$, $u = u(0)$ and

$$\psi_x(k) := \psi_x + F_k' \psi_u F_k \tag{5.7-4}$$

The last term on the R.H.S. of (3) can be reorganized as follows

$$\sum_{j=1}^{T} \|x(j)\|^2_{\psi_x(k)} + \sum_{j=T+1}^{\infty} \|x(j)\|^2_{\psi_x(k)} = \|x(1)\|^2_{\mathcal{L}_T(k)} + \|x(T+1)\|^2_{\mathcal{L}(k)} \tag{5.7-5}$$

where $\mathcal{L}(k)$ satisfies the Lyapunov equation (Cf. (2.5-3))

$$\mathcal{L}(k) = \Phi_k' \mathcal{L}(k) \Phi_k + \psi_x(k) \tag{5.7-6}$$

and $\mathcal{L}_T(k)$ is given by

$$\begin{aligned}
\mathcal{L}_T(k) &:= \sum_{r=0}^{T-1} (\Phi_k')^r \psi_x(k) \Phi_k^r \\
&= \Phi_k' \mathcal{L}_T(k) \Phi_k + \psi_x(k) - (\Phi_k')^T \psi_x(k) \Phi_k^T
\end{aligned} \tag{5.7-7}$$

In both (6) and (7) $\Phi_k$ denotes the closed-loop state transition matrix

$$\Phi_k := \Phi + G F_k \tag{5.7-8}$$

Problem 5.7-1 Prove the identity in the second line of (7). 
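
As a numerical sanity check (not a proof) of the identity in the second line of (7), the
following Python sketch builds the truncated sum directly and compares it with the
right-hand side of the identity. The matrices are arbitrary illustrative values that do not
come from the text, and `truncated_sum` is a hypothetical helper name. The closed-loop matrix
is deliberately not a stability matrix, since the identity does not require stability.

```python
import numpy as np

def truncated_sum(Phi_k, psi_xk, T):
    """Return (L_T, Phi_k^T) with L_T = sum_{r=0}^{T-1} (Phi_k')^r psi_xk Phi_k^r."""
    n = Phi_k.shape[0]
    L_T = np.zeros((n, n))
    M = np.eye(n)                      # M holds Phi_k^r at step r
    for _ in range(T):
        L_T += M.T @ psi_xk @ M
        M = Phi_k @ M
    return L_T, M                      # on exit M equals Phi_k^T (T-th power)

# Illustrative data (assumed for this sketch only)
Phi_k = np.array([[0.5, 1.0],
                  [0.0, 1.1]])         # one eigenvalue outside the unit circle
psi_xk = np.array([[2.0, 0.3],
                   [0.3, 1.0]])
T = 6

L_T, Phi_pow_T = truncated_sum(Phi_k, psi_xk, T)
# Right-hand side of the identity in (7)
rhs = Phi_k.T @ L_T @ Phi_k + psi_xk - Phi_pow_T.T @ psi_xk @ Phi_pow_T
identity_gap = np.max(np.abs(L_T - rhs))
print("max deviation from the identity:", identity_gap)
```

The deviation should be at rounding-error level for any choice of the data.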

The modification we consider consists of replacing the second additive term of (5),
$\|x(T+1)\|^2_{\mathcal{L}(k)}$, by $\|x(T+1)\|^2_{P(k)}$ with $P(k)$ given by the
following pseudo-Riccati iterative equation

$$P(k) = \Phi_k' P(k-1) \Phi_k + \psi_x(k) \tag{5.7-9}$$

initialized from an arbitrary $P(0) = P'(0) \geq 0$. In conclusion, $F_{k+1}$ is obtained by
minimizing w.r.t. $u$, instead of (3), the following modified cost

$$\|x\|^2_{\psi_x} + \|u\|^2_{\psi_u} + \|\Phi x + G u\|^2_{\Pi(k)} \tag{5.7-10}$$

with

$$\Pi(k) = \mathcal{L}_T(k) + (\Phi_k')^T P(k) \Phi_k^T \tag{5.7-11}$$

Problem 5.7-2 Show that the symmetric nonnegative definite matrix $\Pi(k)$ in (11) satisfies the
updating identity

$$\begin{aligned}
\Pi(k) = {} & \Phi_k' \Pi(k-1) \Phi_k + \psi_x(k) \, + \\
& \Phi_k' \left[ \mathcal{L}_T(k) - \mathcal{L}_T(k-1) + (\Phi_k')^T P(k-1) \Phi_k^T - (\Phi_{k-1}')^T P(k-1) \Phi_{k-1}^T \right] \Phi_k
\end{aligned} \tag{5.7-12}$$

To sum up we have the following

Modified Kleinman iterations (MKI) Given any feedback-gain matrix
$F_k$ and any symmetric nonnegative definite matrix $P(k-1)$, compute

$$\mathcal{L}_T(k) = \sum_{r=0}^{T-1} (\Phi_k')^r \psi_x(k) \Phi_k^r \tag{5.7-13}$$

Further, update $P(k-1)$ via (9) to find $P(k)$. Then, compute $\Pi(k)$ as in
(11). The next feedback-gain matrix is then obtained as follows

$$F_{k+1} = -\left[ \psi_u + G' \Pi(k) G \right]^{-1} G' \Pi(k) \Phi \tag{5.7-14}$$

Should $\Pi(k)$ be updated via the Riccati equation

$$\Pi(k) = \Phi_k' \Pi(k-1) \Phi_k + \psi_x(k) \tag{5.7-15}$$

under stabilizability and detectability of $(\Phi, G, H)$ and $\psi_u > 0$, (14) would
asymptotically yield the steady-state LQOR feedback-gain. Now the true updating
equation for $\Pi(k)$ is not given by (15) but, instead, by (12). The latter is the same
as (15) except for an additive perturbation term. If $F_k$ converges, this perturbation
term converges to zero. Hence, under convergence, the asymptotic behaviour of
(9), (11), (13) and (14) coincides with that of (14) and (15).

Proposition 5.7-1. Let $\psi_u > 0$ and $(\Phi, G, H)$ be stabilizable and detectable. Then,
the only possible convergence point of MKI is the steady-state solution of the LQR
problem for the given plant and performance index (1).
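
The MKI recursion (9), (11), (13), (14) can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function names `mki_step` and `run_mki`
are hypothetical, the iterations start from zero feedback and $P = O$, and the test plant is
a scalar open-loop unstable system chosen here (not taken from the text) because its
steady-state LQ gain is available in closed form for comparison.

```python
import numpy as np

def mki_step(Phi, G, psi_x, psi_u, F, P_prev, T):
    """One Modified Kleinman iteration: returns (F_next, P)."""
    n = Phi.shape[0]
    Phi_k = Phi + G @ F                         # closed-loop matrix, (8)
    psi_xk = psi_x + F.T @ psi_u @ F            # closed-loop state weight, (4)
    L_T = np.zeros((n, n))
    M = np.eye(n)
    for _ in range(T):                          # truncated sum, (13)
        L_T += M.T @ psi_xk @ M
        M = Phi_k @ M                           # after the loop M = Phi_k^T
    P = Phi_k.T @ P_prev @ Phi_k + psi_xk       # pseudo-Riccati update, (9)
    Pi = L_T + M.T @ P @ M                      # (11)
    F_next = -np.linalg.solve(psi_u + G.T @ Pi @ G, G.T @ Pi @ Phi)  # (14)
    return F_next, P

def run_mki(Phi, G, psi_x, psi_u, T, iterations):
    """Run MKI from F = 0 and P = 0 (an assumed initialization)."""
    n, m = Phi.shape[0], G.shape[1]
    F = np.zeros((m, n))
    P = np.zeros((n, n))
    for _ in range(iterations):
        F, P = mki_step(Phi, G, psi_x, psi_u, F, P, T)
    return F

# Assumed scalar test plant x(t+1) = 1.2 x(t) + u(t), unit weights
Phi = np.array([[1.2]])
G = np.array([[1.0]])
psi_x = np.array([[1.0]])
psi_u = np.array([[1.0]])

F_mki = run_mki(Phi, G, psi_x, psi_u, T=3, iterations=50)

# Closed-form LQ solution for this scalar case: P^2 - 1.44 P - 1 = 0
P_lq = (1.44 + np.sqrt(1.44**2 + 4.0)) / 2.0
F_lq = -1.2 * P_lq / (1.0 + P_lq)
print("MKI gain:", F_mki[0, 0], " LQ gain:", F_lq)
```

In this example the initial closed loop is unstable, yet the iterates settle on the LQ gain,
which is the behaviour Proposition 5.7-1 allows as the only possible convergence point.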
Though there is no proof of convergence, computer analysis indicates that Modified
Kleinman Iterations have excellent convergence properties irrespective of their
initialization. In particular, if $T$ is at least comparable with the largest time
constant of the LQOR closed-loop system, MKI exhibit a rate of convergence close to
that of Kleinman iterations, whenever initialization takes place from a stabilizing
feedback. Further, unlike Kleinman iterations, MKI appear to have the advantage
of neither requiring the solution of a Lyapunov equation at each step nor being
jeopardized by an unstable initial closed-loop system.

Example 5.7-1 Consider the open-loop unstable nonminimum-phase plant $A(d)y(t) = B(d)u(t)$
with

$$A(d) = (1 - ad)^2 \quad \text{and} \quad B(d) = d - 1.999d^2 \tag{5.7-16}$$

If $a \cong 1.999$, the plant has an almost hidden unstable eigenvalue. If the state $x(t)$ used in
MKI coincides with that of the plant canonical reachable representation, we have an almost
undetectable unstable eigenvalue. If LQOR regulation laws are computed via Riccati iterations
initialized from $P(0) = O_{2\times2}$, after the second iteration we get Single Step Regulation and,
hence, an unstable closed-loop system. By increasing the number of iterations, we eventually obtain
a stabilizing feedback. Because of the almost undetectable unstable eigenvalue, we expect that the
first stabilizing feedback is obtained after quite a large number of iterations. Fig. 1 and 2 show
the closed-loop system eigenvalues when the plant is fed back by $F_k$ computed via MKI. The
eigenvalues are given as a function of the number of iterations $k$ with $T$ as a parameter. Both
figures refer to $\rho = \psi_u/\psi_y = 0.1$ and MKI initialized from $P(0) = O_{2\times2}$ and
$F(0) = O_{1\times2}$. Note that with $F(0) = O_{1\times2}$ and $T = 0$, MKI and Riccati iterations
yield the same feedback. Fig. 1 and Fig. 2 refer to two different choices for $a$: $a = 2$ and,
respectively, $a = 1.999001$. They show that while Riccati iterations require at least 25 and,
respectively, 45 iterations to yield a stabilizing feedback, MKI with $T = 20$ yield stabilizing
feedback-gains after 3 and, respectively, 5 iterations.

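Truncated Cost Iterations

They originate from the RHR problem with performance index

$$J(x,u(\cdot)) = \sum_{j=0}^{T} \left[ \|x(j)\|^2_{\psi_x} + \|u(j)\|^2_{\psi_u} \right] \tag{5.7-17}$$

and constraints

$$u(j) = F_k x(j), \qquad j = 1, 2, \cdots, T \tag{5.7-18}$$

Eqs. (17) and (18) can be embodied into a single equation

$$\begin{aligned}
J_T(x,u,F_k) &= \|x\|^2_{\psi_x} + \|u\|^2_{\psi_u} + \sum_{j=1}^{T} \|x(j)\|^2_{\psi_x(k)} \\
&= \|x\|^2_{\psi_x} + \|u\|^2_{\psi_u} + \|x(1)\|^2_{\mathcal{L}_T(k)}
\end{aligned} \tag{5.7-19}$$

with $x = x(0)$, $u = u(0)$, $\psi_x(k)$ as in (4), and $\mathcal{L}_T(k)$ as in (13).

Truncated Cost Iterations (TCI) Given any feedback-gain matrix $F_k$,
compute $\mathcal{L}_T(k)$ as in (13). The next feedback-gain matrix is then given by

$$F_{k+1} = -\left[ \psi_u + G' \mathcal{L}_T(k) G \right]^{-1} G' \mathcal{L}_T(k) \Phi \tag{5.7-20}$$

As already seen in (7), $\mathcal{L}_T(k)$ satisfies the identity

$$\mathcal{L}_T(k) = \Phi_k' \mathcal{L}_T(k) \Phi_k + \psi_x(k) - (\Phi_k')^T \psi_x(k) \Phi_k^T \tag{5.7-21}$$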
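
The TCI update (20), with the truncated sum (13), can be sketched as follows. The
function names `tci_step` and `run_tci` are hypothetical, the stopping test uses the largest
absolute entry of $F_{k+1} - F_k$ as an assumed reading of the norm, and the scalar test plant
is an illustrative choice made here, not an example from the text. With a long horizon the
iterates are expected to settle close to the LQ gain of that scalar plant.

```python
import numpy as np

def tci_step(Phi, G, psi_x, psi_u, F, T):
    """One Truncated Cost Iteration: F_k -> F_{k+1} via (13) and (20)."""
    n = Phi.shape[0]
    Phi_k = Phi + G @ F
    psi_xk = psi_x + F.T @ psi_u @ F
    L_T = np.zeros((n, n))
    M = np.eye(n)
    for _ in range(T):                          # truncated sum (13)
        L_T += M.T @ psi_xk @ M
        M = Phi_k @ M
    return -np.linalg.solve(psi_u + G.T @ L_T @ G, G.T @ L_T @ Phi)

def run_tci(Phi, G, psi_x, psi_u, T, F0, tol=1e-5, max_iter=200):
    """Iterate until max|F_{k+1} - F_k| < tol; returns (F, iterations used)."""
    F = F0.copy()
    for k in range(1, max_iter + 1):
        F_next = tci_step(Phi, G, psi_x, psi_u, F, T)
        if np.max(np.abs(F_next - F)) < tol:
            return F_next, k
        F = F_next
    return F, max_iter

# Assumed scalar test plant x(t+1) = 1.2 x(t) + u(t), unit weights
Phi = np.array([[1.2]])
G = np.array([[1.0]])
psi_x = np.array([[1.0]])
psi_u = np.array([[1.0]])

F_tci, n_iter = run_tci(Phi, G, psi_x, psi_u, T=30, F0=np.zeros((1, 2 - 1)))

# Closed-form LQ gain of the scalar plant, for comparison
P_lq = (1.44 + np.sqrt(1.44**2 + 4.0)) / 2.0
F_lq = -1.2 * P_lq / (1.0 + P_lq)
print("TCI gain:", F_tci[0, 0], "after", n_iter, "iterations; LQ gain:", F_lq)
```

Shortening `T` in this sketch is a simple way to observe how truncation moves the
convergence point away from the LQ gain.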
Figure 5.7-1: MKI closed-loop eigenvalues for the plant (16) with a = 2.

Figure 5.7-2: MKI closed-loop eigenvalues for the plant (16) with a = 1.999001.
  T     1                    2                    3                    ∞
  #i    1                    5                    5
  F     [ 0.6727  0.4545 ]   [ 0.6741  0.4358 ]   [ 0.6752  0.4465 ]   [ 0.6757  0.4501 ]

Table 5.7-1: TCI convergence feedback row-vectors for the plant of Example 2,
$\rho = 0.15$, zero initial feedback-gain, and various prediction horizons $T$.

We shall refer to (21) as the truncated cost Lyapunov equation since, while $\mathcal{L}(k)$ in
(6) equals the series $\sum_{r=0}^{\infty} (\Phi_k')^r \psi_x(k) \Phi_k^r$ provided that $\Phi_k$
is a stability matrix, $\mathcal{L}_T(k)$ equals the same sum truncated after the $T$-th term

$$\mathcal{L}_T(k) = \sum_{r=0}^{T-1} (\Phi_k')^r \psi_x(k) \Phi_k^r \tag{5.7-22}$$

By the same token, (20) and (22) are called truncated cost iterations (TCI).
Although (20) and (22) do not appear amenable to convergence analysis and,
consequently, sharp stabilizing results are unavailable, we make some considerations on
TCI behaviour. For the sake of simplicity, we consider a SISO plant $A(d)y(t) =
d^{\ell} B(d)u(t)$ as in (4-20) with $A(d)$ and $B(d)$ coprime. We denote again $\psi_u/\psi_y$
by $\rho$. If $\rho = 0$, for any $T \geq 1 + \ell$ TCI coincide with Cheap Control. Hence,
irrespective of initial conditions, in a single iteration TCI yield $F_{CC}$, the Cheap
Control feedback. If $\rho > 0$ and small, a choice often adopted in practice, two
alternative situations take place.

If the plant is minimum-phase and, hence, $F_{CC}$ is a stabilizing feedback we can
expect that, as long as $\rho > 0$ and small so as to make $F_{CC} \cong F_{LQ}$, TCI globally
converge close to such a feedback. The next example is an excerpt of extensive computer
analysis on the subject confirming the above conjecture.

Example 5.7-2 Consider the minimum-phase plant $A(d)y(t) = B(d)u(t)$ with

$$A(d) = 1 + d + 0.74d^2 \qquad B(d) = (1 + 0.5d)d$$

Refer the feedback row-vectors to the plant canonical reachable representation. The plant being
minimum-phase, we expect that, for small $\rho$ and any $T$, TCI globally converge to
$F_{LQ} \cong F_{CC} = [\, 0.74 \;\; 0.5 \,]$. Table 1 shows TCI convergence feedback row-vectors
for $\rho = 0.15$ and zero initial feedback. The row labelled #i reports the number of iterations
required to achieve convergence. Convergence is claimed at the $k$-th iteration if $k$ is the
smallest integer for which $\|F_{k+1} - F_k\|_\infty < 10^{-5}$. Table 2 reports some computer
analysis results showing that TCI are insensitive to the initial feedback $F_0$, even if the
latter makes the closed-loop system unstable. In Table 2 $\Phi_0 = \Phi + GF_0$ indicates the
initial closed-loop transition matrix. All the results refer to $T = 4$ and $\rho = 10^{-5}$.
Because of such a small value of $\rho$, $F_{LQ}$ is indistinguishable from $F_{CC}$.

Table 5.7-2: TCI convergence feedback-gains for the plant of Example 2, $T = 4$,
$\rho = 10^{-5}$ and various initial feedback row-vectors $F_0$. $F$ denotes TCI convergence
feedback.

If the plant is nonminimum-phase, more complications arise. We discuss the situation
qualitatively, assuming that $\rho > 0$ is small enough so as to make $F_{SS} \cong F_{CC}$
and $F_{LQ} \cong F_{SCC}$, where $F_{SS}$ and $F_{SCC}$ denote the Single Step Regulation
feedback and, respectively, the Stabilizing Cheap Control feedback. For $T = 1 + \ell$,
TCI yield $F_{SS}$. For higher values of $T$, TCI acquire a second possible converging
point close to $F_{SCC}$. As $T$ increases, the $F_{SCC}$ domain of attraction expands, while
the one of $F_{SS}$ shrinks. As the size of the latter becomes smaller than the available
numerical precision, global convergence close to $F_{SCC}$ is experienced.

Example 5.7-3 Consider again the open-loop unstable nonminimum-phase plant of Example
1 with $a = 2$. TCI are run for $\rho = 0.1$ and various initial feedback-gains and prediction
horizons $T$. Fig. 3 exhibits TCI convergence closed-loop eigenvalues as a function of $T$ for high
(curve (h)) and low precision computations (curve (l)). The latter is obtained from the first by
rounding off $F_k$ beyond its third significant digit at each iteration. All TCI results of Fig. 3
are obtained for the most unfavourable initialization, viz. by choosing $F_0$ close to $F_{SS}$.
Note that in the high precision case, TCI require a prediction horizon larger than or equal to 16
to converge close to $F_{LQ}$. In the low precision case, such a horizon is more than halved to 7.
Fig. 4 exhibits results similar to the ones in Fig. 3 pertaining to high precision computations
and the most favourable initialization $F_0 = F_{LQ}$. Note that, for such an initial feedback,
prediction horizons larger than or equal to 3 allow TCI to converge close to $F_{LQ}$.

As we see from Example 3, for a given plant there exists a critical value of $T$, call
it $T^*$, such that for all $T > T^*$ TCI converge close to $F_{LQ}$. $T^*$ depends upon
the precision of computations as well as the open-loop plant zero/pole locations.
The reason is that each unstable closed-loop eigenvalue associated with $F_{SS} \cong F_{CC}$,
viz. approximately each unstable plant zero, must give a significant contribution
to $J_{T^*}$. We express this property by saying that each unstable plant zero must
be well detectable within the critical prediction horizon $T^*$. For a second order
nonminimum-phase plant with $A(d) = 1 + a_1 d + a_2 d^2$ and $B(d) = b_1(1 + \beta d)d$,
$|\beta| > 1$, detectability of the $\beta$ mode within $T^*$ increases with
$|\beta^{T^*} \det\Theta|$ (Cf. Problem 3). Here $\det\Theta = b_1^2(\beta^2 - a_1\beta + a_2)$
equals the determinant of the observability matrix of the reachable canonical state
representation of the plant. Note that an almost pole/zero cancellation yields a small
value of $|\det\Theta|$ and, hence, makes detectability only possible for very large values
of $T$. E.g., for the plant of Example 3 we find $|\det\Theta| = 10^{-6}$. The closeness of
the double pole in 2 to the zero in 1.999 is responsible for such a small value and the
corresponding large critical prediction horizon $T^* = 14$. The next example shows that if
$|\det\Theta|$ is increased to 2.74 by moving the former double pole to $2 \pm j0.7$ the
critical prediction horizon $T^*$ decreases to 4.
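
For the plant of Example 3, the quoted value of $|\det\Theta|$ can be checked by direct
substitution in the expression above. With $a = 2$ in (16), $A(d) = (1 - 2d)^2 = 1 - 4d + 4d^2$
and $B(d) = d - 1.999d^2 = (1 - 1.999d)d$, so that matching coefficients gives $a_1 = -4$,
$a_2 = 4$, $b_1 = 1$ and $\beta = -1.999$. Hence

$$\det\Theta = b_1^2\left(\beta^2 - a_1\beta + a_2\right) = 3.996001 - 7.996 + 4 = 10^{-6}$$

which is the squared distance between the zero in 1.999 and the double pole in 2.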
Figure 5.7-3: TCI closed-loop eigenvalues for the plant (16) with $a = 2$ and
$\rho = 0.1$, when: high precision (h) and low precision (l) computations are used.
TCI are initialized from a feedback close to $F_{SS}$.

Figure 5.7-4: TCI closed-loop eigenvalues for the plant (16) with $a = 2$, $\rho = 0.1$
and high precision computations. TCI are initialized from $F_{LQ}$.

Figure 5.7-5: TCI closed-loop eigenvalues for the plant of Example 4 and $\rho = 0.1$,
when high precision computations are used. TCI are initialized from a feedback
close to $F_{SS}$.



Example 5.7-4 Consider the open-loop unstable nonminimum-phase plant $A(d)y(t) = B(d)u(t)$
with

$$A(d) = 1 - 4d + 4.49d^2 \qquad B(d) = d - 1.999d^2$$

As in Example 3, we set $\rho = 0.1$. With reference to feedback-gains pertaining to the plant
canonical reachable state representation, TCI are computed for various initial feedback-gains and
prediction horizons $T$. The results are reported in Fig. 5 under the same conditions of Fig. 3,
case (h).

For all the examples considered so far the eigenvalue of the LQ regulated system
approximately equals 0.5 and, for the resulting values of $T^*$, it turns out that
$|0.5^{T^*}| < 0.1$. This implies that truncation after $T^*$ of the performance index does
not remove $F_{LQ}$ from the possible converging points of TCI. In more general situations,
it is not guaranteed that good detectability within $T$ of open-loop unstable zeros
implies that $|\lambda_M^{T}| \ll 0.1$, $\lambda_M$ being the eigenvalue of the LQ regulated
system with maximum modulus. Consequently, in order to let TCI converge close to $F_{LQ}$
from all possible initializations, $T^*$ must be large enough so as to make open-loop
unstable zeros well detectable and, at the same time, guarantee that
$|\lambda_M^{T^*}| \ll 0.1$.

Example 5.7-5 Consider the open-loop unstable nonminimum-phase plant $A(d)y(t) = B(d)u(t)$
with

$$A(d) = 1 - 4d + 4.49d^2 \qquad B(d) = d - 1.01d^2$$

Here, Stabilizing Cheap Control yields a closed-loop eigenvalue approximately equal to 0.99. For
such a plant Fig. 6 reports convergence TCI feedback-gains for $\rho = 10^{-1}$ and
$\rho = 10^{-3}$. We see that since the closed-loop eigenvalue $\lambda_M$ is closer to one in the
latter case, TCI require a much larger $T$ to converge close to $F_{LQ} \cong F_{SCC}$ for
$\rho = 10^{-3}$ than for $\rho = 10^{-1}$.

The above qualitative considerations on TCI pertain to vanishingly small values
of $\rho$. However, they can be extended mutatis mutandis to any possible $\rho > 0$ by
considering that for $T = 1 + \ell$ TCI yield $F_{SS}$, the single step regulation feedback,
and, as $T$ increases, TCI acquire another possible convergence point close to $F_{LQ}$,
the LQOR feedback. As $T$ increases, the $F_{SS}$ domain of attraction shrinks, while
the one of $F_{LQ}$ expands. As the size of the first becomes smaller than the available
numerical precision, global convergence close to $F_{LQ}$ is experienced.

Figure 5.7-6: TCI feedback-gains for $\rho = 10^{-1}$ and $\rho = 10^{-3}$, for the plant of
Example 5.

Problem 5.7-3 Consider the SISO plant (4-20) under Cheap Control regulation $u(t) =
F_{CC}\, s(t) + \eta(t)$ with $s(t)$ as in (6-12), where $\eta(t)$ plays the role of an exogenous
variable. Let $H_{y\eta}(d)$ and $H_{u\eta}(d)$ be the closed-loop transfer functions from
$\eta(t)$ to $y(t)$ and, respectively, $u(t)$. Find

$$H_{y\eta}(d) = d^{1+\ell}\, b_1 \quad \text{and} \quad H_{u\eta}(d) = b_1 \frac{dA(d)}{B(d)}$$

Conclude that for $A(d) = 1 + a_1 d + a_2 d^2$ and $B(d) = b_1(1 + \beta d)d$, $|w_k|$, $w_k$
being the $k$-th sample of the impulse response associated with $H_{u\eta}(d)$, for $k \geq 2$
equals $|\beta^{k-2} \det\Theta / b_1^2|$, with $\det\Theta = b_1^2(\beta^2 - a_1\beta + a_2)$.



TCI Computations

There are better ways to carry out TCI than using (20) and (22). The starting point
is to see how to embody the constraints (18) in the $i$-step ahead output evaluation
formula (5-5). This is rewritten here for a generic MIMO plant $(\Phi, G, H)$ with state
vector $x(t) \in \mathbb{R}^n$

$$y(i) = w_1 u(i-1) + \cdots + w_i u(0) + S_i x(0) \tag{5.7-23}$$

$$w_i := H\Phi^{i-1}G \qquad S_i := H\Phi^i \tag{5.7-24}$$

We recall that with TCI the problem is to find, given $F_k$, the next feedback-gain
matrix $F_{k+1}$ in accordance with the RHR problem (17) and (18). We consider
this problem for $\psi_x = H'\psi_y H$ and $y(i) = Hx(i)$. This is a simple optimization
problem in that the only vector to be chosen is the first in the sequence $u_{[0,T)}$, all
the remaining vectors being given by (18). We point out that, taking into account
(18), (23) can be rewritten as follows

$$y(i) = \theta_i(k)u(0) + \Gamma_i(k)x(0), \qquad i = 1, 2, \cdots, T \tag{5.7-25}$$

Similarly,

$$u(i-1) = \mu_i(k)u(0) + \Lambda_i(k)x(0) \tag{5.7-26}$$

with

$$\mu_1(k) = I_m \quad \text{and} \quad \Lambda_1(k) = O_{m \times n} \tag{5.7-27}$$

Problem 5.7-4 Find recursive formulae in the index $i$ to express the matrices $\theta_i(k)$,
$\Gamma_i(k)$, $\mu_i(k)$, $\Lambda_i(k)$ in terms of $(\Phi, G, H)$ and the feedback-gain matrix
$F_k$.

It is worth pointing out the substantial difference between (23) and (25). Unlike
(23), where the inputs $u(1), \cdots, u(i-1)$ appear explicitly, (25) depends on them only
implicitly via the matrices $\theta_i(k)$ and $\Gamma_i(k)$. These are in fact
feedback-dependent. The same holds true for (26). In order to underline this property we
will refer to (25) as the closed-loop $i$-step ahead output evaluation formula, and to (26)
as the closed-loop $(i-1)$-step ahead input evaluation formula. Further, (25) and (26) will
be referred to as output and, respectively, input many steps ahead evaluation formulae,
whenever no specification of a particular step is desired or needed.

Once (25) and (26) are given, it is a simple matter to find the desired feedback 
updating formula. 

Proposition 5.7-2. Let the closed-loop many steps ahead output and input evaluation
formulae be given as in (25) and, respectively, (26) when the feedback $F_k$ is
used. Then, the next TCI feedback is given by

$$F_{k+1} = -\Xi_k^{-1} \sum_{i=1}^{T} \left[ \theta_i'(k)\psi_y\Gamma_i(k) + \mu_i'(k)\psi_u\Lambda_i(k) \right] \tag{5.7-28}$$

$$\Xi_k := \sum_{i=1}^{T} \left[ \theta_i'(k)\psi_y\theta_i(k) + \mu_i'(k)\psi_u\mu_i(k) \right] \tag{5.7-29}$$

We note that, by virtue of (27), the $m \times m$ matrix $\Xi_k$ is nonsingular, irrespective
of $F_k$, whenever $\psi_u > 0$.

Problem 5.7-5 Verify the TCI feedback updating formula (28). 

A convenient procedure for computing the matrices $\theta_i(k)$, $\Gamma_i(k)$, $\mu_i(k)$ and
$\Lambda_i(k)$ in (25) and (26) is now discussed. This will be referred to as the closed-loop
system response procedure. It is the counterpart in TCI regulation of the plant response
procedure used in SIORHR and GPR. Let

$$S_k^y(i, x(0), u(0)) \quad \text{and} \quad S_k^u(i, x(0), u(0)) \tag{5.7-30}$$

denote the plant output and, respectively, input response at time $i$ to the input
$u(0)$ at time 0, from the state $x(0)$ at time 0, with the subsequent inputs given by the
time-invariant state-feedback

$$u(j) = F_k x(j), \qquad j = 1, \cdots, i-1$$

Let $\theta_{i,r}(k)$ denote the $r$-th column of $\theta_i(k)$

$$\theta_i(k) = [\, \theta_{i,1}(k) \;\; \cdots \;\; \theta_{i,m}(k) \,] \tag{5.7-31}$$

Let us adopt similar notations to denote the columns of the other matrices of
interest. Then, the following equalities can be drawn from (25) and (26)

$$\theta_{i,r}(k) = S_k^y(i, O_n, e_r \in \mathbb{R}^m) \tag{5.7-32}$$

$$\mu_{i,r}(k) = S_k^u(i-1, O_n, e_r \in \mathbb{R}^m) \tag{5.7-33}$$

$$\Gamma_{i,r}(k) = S_k^y(i, e_r \in \mathbb{R}^n, O_m) \tag{5.7-34}$$

$$\Lambda_{i,r}(k) = S_k^u(i-1, e_r \in \mathbb{R}^n, O_m) \tag{5.7-35}$$

where $e_r$ denotes the $r$-th vector of the natural basis of the space to which it
belongs.

Figure 5.7-7: Realization of the TCI regulation law via a bank of T parallel
feedback-gain matrices.

Proposition 5.7-3. The columns of the matrices which, for a given feedback $F_k$,
parameterize the closed-loop, many steps ahead input/output evaluation formulae
(25) and (26), can be obtained by running the plant fed back by $F_k$ as indicated in
(32)-(35).
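
The closed-loop system response procedure can be sketched in Python as follows. The
parameter matrices of (25) and (26) are filled in column by column by simulating the plant
as in (32)-(35), and the feedback is then updated via (28) and (29). All function names are
hypothetical, and the second-order plant, the current gain `F_k` and the weights are
illustrative values assumed here rather than data from the text. The final lines compare
the truncated cost at the updated input with the cost at nearby inputs.

```python
import numpy as np

def closed_loop_response(Phi, G, H, F, x0, u0, T):
    """Apply u(0)=u0, then u(j)=F x(j); return [y(1..T)] and [u(0..T-1)]."""
    ys, us = [], []
    x, u = x0, u0
    for _ in range(T):
        us.append(u)                  # u(i-1)
        x = Phi @ x + G @ u           # x(i)
        ys.append(H @ x)              # y(i)
        u = F @ x                     # u(i) from the state feedback
    return ys, us

def evaluation_matrices(Phi, G, H, F, T):
    """Columns of theta_i, Gamma_i, mu_i, Lambda_i via (32)-(35)."""
    n, m, p = Phi.shape[0], G.shape[1], H.shape[0]
    theta, mu = np.zeros((T, p, m)), np.zeros((T, m, m))
    Gamma, Lam = np.zeros((T, p, n)), np.zeros((T, m, n))
    for r in range(m):                # zero state, unit input: (32), (33)
        ys, us = closed_loop_response(Phi, G, H, F, np.zeros(n), np.eye(m)[r], T)
        for i in range(T):
            theta[i, :, r], mu[i, :, r] = ys[i], us[i]
    for r in range(n):                # unit state, zero input: (34), (35)
        ys, us = closed_loop_response(Phi, G, H, F, np.eye(n)[r], np.zeros(m), T)
        for i in range(T):
            Gamma[i, :, r], Lam[i, :, r] = ys[i], us[i]
    return theta, Gamma, mu, Lam

def tci_update(theta, Gamma, mu, Lam, psi_y, psi_u):
    """Next feedback from (28), with Xi_k as in (29)."""
    Xi = sum(t.T @ psi_y @ t + q.T @ psi_u @ q for t, q in zip(theta, mu))
    rhs = sum(t.T @ psi_y @ g + q.T @ psi_u @ l
              for t, g, q, l in zip(theta, Gamma, mu, Lam))
    return -np.linalg.solve(Xi, rhs)

def truncated_cost(Phi, G, H, F, x0, u0, T, psi_y, psi_u):
    """Sum over i=1..T of y(i)' psi_y y(i) + u(i-1)' psi_u u(i-1)."""
    ys, us = closed_loop_response(Phi, G, H, F, x0, u0, T)
    return sum(float(y @ psi_y @ y + u @ psi_u @ u) for y, u in zip(ys, us))

# Illustrative data (assumed)
Phi = np.array([[1.2, 1.0], [0.0, 0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
F_k = np.array([[-0.3, -0.2]])
psi_y, psi_u, T = np.eye(1), 0.1 * np.eye(1), 5

theta, Gamma, mu, Lam = evaluation_matrices(Phi, G, H, F_k, T)
F_next = tci_update(theta, Gamma, mu, Lam, psi_y, psi_u)

x0 = np.array([1.0, -0.5])
u_star = F_next @ x0
J_star = truncated_cost(Phi, G, H, F_k, x0, u_star, T, psi_y, psi_u)
J_plus = truncated_cost(Phi, G, H, F_k, x0, u_star + 0.05, T, psi_y, psi_u)
J_minus = truncated_cost(Phi, G, H, F_k, x0, u_star - 0.05, T, psi_y, psi_u)
print(J_star, J_plus, J_minus)
```

Only simulations of the plant under the current feedback are needed, which is the point of
the procedure: no powers of the closed-loop matrix are formed explicitly.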

Problem 5.7-6 Show that, using (28) with $F := F_{k+1}$, the TCI feedback can be rewritten as
follows

$$F = \sum_{i=1}^{T} \Pi_i f_i \tag{5.7-36}$$

with

$$\sum_{i=1}^{T} \Pi_i = I_m \tag{5.7-37}$$

Express $\Pi_i$ and $f_i$ in terms of $\theta_i$, $\Gamma_i$, $\mu_i$ and $\Lambda_i$. Note that
by (36) the TCI regulation law $u(t) = Fx(t)$ is realizable by a bank of $T$ parallel
feedback-gains as in Fig. 7.

Main points of the section Modified Kleinman iterations yield at convergence
the steady-state LQOR feedback. Though they can have a rate of convergence
close to that of standard Kleinman iterations, they are not jeopardized by an
initially unstable closed-loop system, and do not require the solution of a Lyapunov
equation at each step.

Computer analysis is used to study convergence properties of TCI. This study
reveals that there exists a critical prediction horizon $T^*$ such that for all $T > T^*$
TCI converge close to $F_{LQ}$. Such a critical horizon depends on both the open-loop
unstable zero/pole locations and the largest time constant of the LQ regulated
system.

Under TCI setting, plant outputs and inputs can be expressed by the closed-loop
many steps ahead evaluation formulae (25) and (26) whose parameter matrices
are feedback-dependent. These closed-loop parameter matrices allow one to update
the feedback-gain by the simple formula (28).



5.8 Tracking 

We study how to extend the RHR laws of the previous sections so as to enable the 
plant output y(t) to track a reference variable r(t). The aim is to modify the basic 
RH regulators so as to make the tracking error 

e(t) := y(t) - r(t) (5.8-1) 

small or possibly zero as $t \to \infty$, irrespective of the initial conditions. The reader
is referred to Sect. 3.5-1 for the necessary preliminaries.



5.8.1 1-DOF Trackers 

Every stabilizing RHR law can be modified in a straightforward manner so as to
obtain a 1-DOF tracker ensuring, under standard conditions, asymptotic tracking.
Of paramount interest in applications is to guarantee asymptotic tracking of
constant references as well as asymptotic rejection of constant disturbances. This can
be done as follows.

Let the plant be as in (4-20). Next, the model (3.5-3) for a constant reference
is given by

$$\Delta(d)r(t) = 0, \qquad \Delta(d) := 1 - d \tag{5.8-2}$$

Hence, following the procedure which led us to (3.5-4), we find

$$\begin{aligned}
A(d)\Delta(d)e(t) &= \mathcal{B}(d)\,\delta u(t) := B(d)\,\delta u(t - \ell) \\
\delta u(t) &:= u(t) - u(t-1) \\
B(1) &\neq 0
\end{aligned} \tag{5.8-3}$$

This is the new plant to be used in the RHR law synthesis. In particular, the
resulting control law is of the form

$$\delta u(t) = F s(t) \tag{5.8-4}$$

$$s(t) := \left[ \left( e^{t}_{t-n_a} \right)' \;\; \left( \delta u^{t-1}_{t-\ell-n_b+1} \right)' \right]' \tag{5.8-5}$$

We note that (4) can be rewritten as a difference equation

$$R(d)\,\delta u(t) = -S(d)e(t) \tag{5.8-6}$$

with $R(d)$ and $S(d)$ coprime polynomials in the backward shift operator $d$ such
that $R(0) = 1$, $\partial R(d) \leq \ell + n_b - 1$ and $\partial S(d) \leq n_a$. Provided that
the closed-loop system (3) and (6) is internally stable, viz.
$A(d)\Delta(d)R(d) + \mathcal{B}(d)S(d)$ is strictly Hurwitz, from Theorem 3.5-1 it follows
that, thanks to the presence of the integral action in the loop, both asymptotic tracking
of constant references (set-points) and asymptotic rejection of constant disturbances are
achieved.
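
The effect of the integral action can be illustrated with a minimal simulation. The sketch
below is built on assumptions that are not in the text: a first-order plant
$y(t+1) = a\,y(t) + b\,u(t) + w$ with a constant disturbance $w$, and gains chosen by
deadbeat pole assignment (so that $A(d)\Delta(d)R(d) + \mathcal{B}(d)S(d) = 1$ with
$R(d) = 1$ and $S(d) = s_0 + s_1 d$) instead of an RHR design. The function name
`simulate_integral_tracker` is hypothetical.

```python
def simulate_integral_tracker(a, b, r, w, steps):
    """Simulate R(d) du(t) = -S(d) e(t) on y(t+1) = a y(t) + b u(t) + w.

    Returns the list of tracking errors e(t) = y(t) - r for t = 0..steps-1.
    """
    # Deadbeat choice: (1 - a d)(1 - d) + b d (s0 + s1 d) = 1
    s0 = (1.0 + a) / b
    s1 = -a / b
    y, u_prev, e_prev = 0.0, 0.0, 0.0
    errors = []
    for _ in range(steps):
        e = y - r                      # tracking error (5.8-1)
        du = -(s0 * e + s1 * e_prev)   # control increment, R(d) = 1
        u = u_prev + du                # integral action: u(t) = u(t-1) + du(t)
        errors.append(e)
        y = a * y + b * u + w          # plant update with constant disturbance
        u_prev, e_prev = u, e
    return errors

# Open-loop unstable plant, unit set-point, constant disturbance (assumed values)
errors = simulate_integral_tracker(a=1.2, b=1.0, r=1.0, w=0.3, steps=20)
print(errors[:5])
```

The error vanishes after a short transient even though neither the disturbance value nor the
plant steady-state gain is used by the controller, which is the role of the factor
$\Delta(d)$ in the loop.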
5.8.2 2-DOF Trackers

Our discussion of 2-DOF trackers starts from a plant model as in (3), where the
input increment $\delta u(t)$ appears as the new input variable, so as to have an
integral action in the loop.

We first show how to design 2-DOF LQ trackers and, next, how to modify the
various RHR laws so as to obtain 2-DOF trackers equipped with integral action.
The reason to start with LQ tracking is that the related results clearly reveal the
improvement in tracking performance that can be achieved by independently using
the reference variable in the control law. The next example, in which 1-DOF and
2-DOF controllers are compared, shows that such an improvement may turn out to
be quite dramatic.

Example 5.8-1 Consider the discrete-time open-loop stable nonminimum-phase plant
$A(d)\Delta(d)y(t) = B(d)\,\delta u(t)$ with

$$A(d)\Delta(d) = 1 - 1.8258d + 0.8630d^2 - 0.0376d^3 + 0.0004d^4$$
$$B(d) = 0.1669d + 0.3246d^2 - 0.6832d^3 - 0.0192d^4$$

This is obtained from the following continuous-time open-loop stable nonminimum-phase plant

(s + 1)(1 + ^-s) 2 y(r) = o, 2 - 1)u(t - 0.2) 
15 

where $\tau \in \mathbb{R}$ and $s$ denotes the time-derivative operator, viz.
$s\,y(\tau) = dy(\tau)/d\tau$, by sampling the output every $T_s = 0.25$ seconds and holding the
input constant and equal to $u(t)$ over the interval $[tT_s, (t+1)T_s]$. Fig. 1 and Fig. 2 show
the reference, a square wave, along with the plant output, when the plant input is controlled by
a 1-DOF and, respectively, a 2-DOF, LQ tracker. Both trackers are optimal w.r.t. the performance
index

$$\sum_{k=0}^{\infty} \left\{ e^2(k) + 5\,\delta u^2(k) \right\} \tag{5.8-7}$$

Further, according to the pertinent results in the next Theorem 1, the 2-DOF LQ tracker exploits
the knowledge of the reference over 15 steps in the future.
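
The qualifiers "open-loop stable" and "nonminimum-phase" can be checked numerically from the
discrete-time coefficients listed in the example. The Python sketch below divides
$A(d)\Delta(d)$ by $\Delta(d) = 1 - d$ to recover $A(d)$, then computes poles and zeros in
the $z = d^{-1}$ domain. The variable names are introduced here for illustration only, and
the published coefficients are rounded, so the results are approximate.

```python
import numpy as np
from numpy.polynomial import polynomial as Pn

# Coefficients in ascending powers of d, as listed in the example
A_delta = [1.0, -1.8258, 0.8630, -0.0376, 0.0004]
B = [0.0, 0.1669, 0.3246, -0.6832, -0.0192]

# Recover A(d) by dividing out Delta(d) = 1 - d
A, remainder = Pn.polydiv(A_delta, [1.0, -1.0])

# A(d) = 1 + a1 d + ... corresponds to z^n + a1 z^(n-1) + ... in z = 1/d,
# so the ascending-in-d coefficients are already descending in z.
poles = np.roots(A)
zeros = np.roots(B[1:])      # drop the pure delay factor d of B(d)

print("A(d) coefficients:", A)
print("pole moduli:", np.abs(poles))
print("zero moduli:", np.abs(zeros))
```

All poles of $A(d)$ should lie inside the unit circle, while at least one zero of $B(d)$
lies outside it.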

LQ Trackers We study how to modify the pure LQ regulator law so as to single
out 2-DOF controllers. There are standard ways, [KS72], [AM90] and [BGW90], to
do this via the Riccati-based approach. However, our intention is to use the
polynomial equation approach of Chapter 4, being ultimately interested in exploiting
the results in an I/O system description framework.
We consider a MIMO plant

$$A(d)\Delta(d)y(t) = B(d)\,\delta u(t) \tag{5.8-8}$$

with $\Delta(d) = (1 - d)I_p$, $A(d)$ and $B(d)$ left coprime, and $\det B(1) \neq 0$. The last
two conditions are equivalent to saying that $A(d)\Delta(d)$ and $B(d)$ are left coprime. In
(8) we adopt the usual choice of dealing with input increments $\delta u(t)$, in place of
inputs $u(t)$, so as to introduce an integral action in the feedback control system.
We assume that a state-vector $x(t)$, $\dim x(t) = n_x$, is chosen to represent (8) via a
stabilizable and detectable state-space representation as in (4.1-1). E.g., similarly
to (5-1), such a state vector can be made up by I/O pairs. Consider next the
steady-state LQOR problem (4.1-1)-(4.1-4). By (4.4-24), its solution is of the
form $X\,\delta u(t) = -Yx(t)$. We consider the following modified version of such a
regulation law

$$X\,\delta u(t) = -Yx(t) + v(t) \tag{5.8-9}$$
Figure 5.8-2: Reference and plant output when a 2-DOF LQ controller is used
for the tracking problem of Example 1.

In (9) $v(t) \in \mathbb{R}^m$ has to be chosen, so as to make the tracking error (1) small
in a sense to be specified, amongst all $\mathbb{R}^m$-valued functions of $x_{[0,t]}$,
$\delta u_{[0,t)}$ and $r(\cdot) := r_{[0,\infty)}$

$$v(t) = \varphi\left( x_{[0,t]}, \delta u_{[0,t)}, r(\cdot) \right) \tag{5.8-10}$$

It follows from (10) that (9) can be any linear or nonlinear 2-DOF controller, causal
from $x(t)$ to $\delta u(t)$. On the other hand, (9) is permitted to be anticipative from
$r(t)$ to $\delta u(t)$, the whole reference sequence $r(\cdot)$ being assumed to be available to
the controller at any time. The problem is to choose $v(t)$ in such a way that the
feedback control system is stable and the performance index

$$J := \|e(\cdot)\|^2_{\psi_y} + \|\delta u(\cdot)\|^2_{\psi_u} \tag{5.8-11}$$

is minimized. In order to make the problem tractable in a fully deterministic
framework, we stipulate that the plant state is initially zero

$$x(0) = O_{n_x} \tag{5.8-12}$$

Further, it is assumed that

$$\|r(\cdot)\|^2_{\psi_y} < \infty \tag{5.8-13}$$

We recall that, since $(X, Y)$ solves the steady-state LQOR problem with performance
index (11) for $r(t) \equiv O_p$, according to (4.3-9) and (4.1-12) we have

$$X A_2(d) + Y B_2(d) = E(d) \tag{5.8-14}$$

$$E^*(d)E(d) = A_2^*(d)\psi_u A_2(d) + B_2^*(d)H'\psi_y H B_2(d) \tag{5.8-15}$$

with $H B_2(d) A_2^{-1}(d) = A^{-1}(d)\Delta^{-1}(d)B(d)$, $A_2(d)$ and $B_2(d)$ right coprime, and
$E(d)$ assumed here to be strictly Hurwitz. Using the $d$-representations of the
involved sequences, from (8) and (9) we find

$$\begin{aligned}
\hat{x}(d) &= \left[ I_{n_x} + B_2(d)A_2^{-1}(d)X^{-1}Y \right]^{-1} B_2(d)A_2^{-1}(d)X^{-1}\hat{v}(d) \\
&= \left\{ I_{n_x} + B_2(d)\left[ E(d) - YB_2(d) \right]^{-1} Y \right\}^{-1} B_2(d)A_2^{-1}(d)X^{-1}\hat{v}(d) \qquad [(14)]
\end{aligned}$$

By the Matrix Inversion Lemma (3-16), we get

$$\hat{x}(d) = \left[ I_{n_x} - B_2(d)E^{-1}(d)Y \right] B_2(d)A_2^{-1}(d)X^{-1}\hat{v}(d)$$

which, by (14), becomes

$$\begin{aligned}
\hat{x}(d) &= B_2(d)\left\{ I - E^{-1}(d)\left[ E(d) - XA_2(d) \right] \right\} A_2^{-1}(d)X^{-1}\hat{v}(d) \\
&= B_2(d)E^{-1}(d)\hat{v}(d)
\end{aligned} \tag{5.8-16}$$

By similar arguments, we also find

$$\delta\hat{u}(d) = A_2(d)E^{-1}(d)\hat{v}(d) \tag{5.8-17}$$

Then, using (3.1-12),

$$\begin{aligned}
J &= \left\langle \hat{e}^*(d)\psi_y\hat{e}(d) + \delta\hat{u}^*(d)\psi_u\,\delta\hat{u}(d) \right\rangle \\
&= \left\langle \hat{y}^*(d)\psi_y\hat{y}(d) + \delta\hat{u}^*(d)\psi_u\,\delta\hat{u}(d) + \hat{r}^*(d)\psi_y\hat{r}(d) - \hat{y}^*(d)\psi_y\hat{r}(d) - \hat{r}^*(d)\psi_y\hat{y}(d) \right\rangle
\end{aligned}$$

Now the sum of the first two terms in the latter expression equals

$$\begin{aligned}
& \left\langle \left[ HB_2(d)E^{-1}(d)\hat{v}(d) \right]^* \psi_y HB_2(d)E^{-1}(d)\hat{v}(d) + \left[ A_2(d)E^{-1}(d)\hat{v}(d) \right]^* \psi_u A_2(d)E^{-1}(d)\hat{v}(d) \right\rangle \\
& \quad = \left\langle \hat{v}^*(d)\hat{v}(d) \right\rangle \qquad [(15)]
\end{aligned}$$

In conclusion, the quantity to be minimized w.r.t. $\hat{v}(d)$ becomes

$$\left\| \hat{v}(d) - E^{-*}(d)B_2^*(d)H'\psi_y\hat{r}(d) \right\|^2$$

Hence, the optimal choice turns out to be

$$\hat{v}(d) = E^{-*}(d)B_2^*(d)H'\psi_y\hat{r}(d) \tag{5.8-18}$$

or, in terms of the impulse response matrix of the strictly causal and stable transfer
matrix $HB_2(d)E^{-1}(d)$,

$$HB_2(d)E^{-1}(d) = \sum_{i=1}^{\infty} h'(i)d^i \tag{5.8-19}$$

$$v(t) = \sum_{i=1}^{\infty} h(i)\psi_y r(t+i) \tag{5.8-20}$$

We see that the optimal feedforward term $v(t)$ in the 2-DOF LQ tracker (9) turns
out to be a linear combination of future samples of the reference to be tracked.
Recalling (4.2-2), (18) can be equivalently written as

$$\hat{v}(d) = \bar{E}^{-1}(d)\bar{B}_2(d)H'\psi_y\hat{r}(d) \tag{5.8-21}$$

The reader is warned not to believe that, thanks to (21), (9) can be reorganized as
follows

$$\bar{E}(d)X\,\delta u(t) = -\bar{E}(d)Yx(t) + \bar{B}_2(d)H'\psi_y r(t) \tag{5.8-22}$$

In fact, it is easily seen that, $\bar{E}(d)$ being anti-Hurwitz, (22) yields an unstable
closed-loop system. We insist on underlining that the correct interpretation of
(21) is to provide a command or feedforward input increment in terms of a linear
combination of future reference samples.



Example 5.8-2 Consider a SISO nonminimum-phase plant with $HB_2(d) = d(1 - bd)$, $|b| > 1$,
and the performance index (11) with $\psi_u = 0$ and $\psi_y = 1$. Then, for $r(t) \equiv 0$,
Stabilizing Cheap Control results. Hence

$$E(d) = k\left( 1 - b^{-1}d \right), \qquad k := \left[ \frac{1 + b^2}{1 + b^{-2}} \right]^{1/2} \tag{5.8-23}$$

The optimal feedforward input increment (18) equals

$$\hat{v}(d) = k^{-1} \frac{\left( 1 - bd^{-1} \right)d^{-1}}{1 - b^{-1}d^{-1}}\, \hat{r}(d) \tag{5.8-24}$$

$$v(t) = k^{-1} \left[ r(t+1) + \left( 1 - b^2 \right) \sum_{j=1}^{\infty} b^{-j} r(t+j+1) \right] \tag{5.8-25}$$
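
The feedforward formula (25) can be evaluated with a finite preview of the reference by
truncating the infinite sum, since the weights decay like $b^{-j}$ for $|b| > 1$. The
Python sketch below does this. The function names, the truncation to a finite preview and
the numerical value $b = 2$ are assumptions made for illustration and are not part of the
example.

```python
import math

def feedforward_weights(b, preview):
    """Weights c_0..c_preview with v(t) ~ sum_j c_j r(t+1+j), from (23) and (25)."""
    k = math.sqrt((1.0 + b * b) / (1.0 + b ** -2))      # gain k in (23)
    weights = [1.0 / k]                                  # coefficient of r(t+1)
    for j in range(1, preview + 1):
        weights.append((1.0 - b * b) * b ** (-j) / k)    # coefficient of r(t+j+1)
    return weights

def feedforward(b, future_reference):
    """Truncated version of (25); future_reference[j] holds r(t+1+j)."""
    weights = feedforward_weights(b, len(future_reference) - 1)
    return sum(c * r for c, r in zip(weights, future_reference))

b = 2.0
# Constant unit reference seen over a long and a short preview window
v_long = feedforward(b, [1.0] * 61)
v_short = feedforward(b, [1.0] * 16)
print(v_long, v_short)
```

For a constant unit reference and $b = 2$ the full sum in (25) equals $-1$, so the two
printed values show how quickly a finite preview approaches the infinite-horizon feedforward.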
Problem 5.8-1 Consider the same tracking problem as in Example 2 with the exception
that $|b| < 1$, viz. the plant is here minimum-phase. Show that, instead of (25), here we get
$v(t) = k^{-1}r(t+1)$.

Problem 1 and Example 2 point out the different amount of information on the
reference required for computing the feedforward input $v(t)$ in the minimum-phase and,
respectively, the nonminimum-phase plant case.

While in the first case only $r(t+1)$ is needed at time $t$, the latter case requires the
knowledge of the whole reference future, viz. $r_{[t+1,\infty)}$. Hence, we can expect that
exploitation of the future of the reference can yield significant tracking performance
improvements in the nonminimum-phase plant case. For instance, for the tracking
problem of Example 1 it can be found that, setting the reference future equal to
the current reference value, the 2-DOF LQ tracking performance deteriorates from
that in Fig. 2 to approximately the one of Fig. 1.

Problem 5.8-2 Show that for a square plant (rre = p) the 2-DOF controller (9) and (18) yields 
an offset-free closed-loop system, provided that no unstable hidden modes are present. [Hint: 
Use (16) and (15). ] 

Although (18) is obtained under the restrictive assumption (12), it is reassuring that the 2-DOF LQ control law has the form (9). In fact, the latter shows that if $r(t) \equiv O_p$ the controller acts as the steady-state LQOR and hence counteracts nonzero initial states in an optimal way. The generic situation of nonzero initial state and nonzero reference is not included in our result. Further, it appears difficult to remove (12) in the present deterministic framework. Eq. (18) will be fully justified in Sect. 7.5 by reformulating the problem in a stochastic setting. In view of this, from now on we refer to (9) and (18) as the 2-DOF LQ control law.

Theorem 5.8-1. (2 DOF LQ Control) The 2-DOF control law minimizing 
(11) for a finite energy reference and a square plant with zero initial state is given 
by 

    $X\,\delta u(t) = -Y x(t) + v(t)$    (5.8-26)

where X and Y are constant matrices in (14) solving the pure underlying LQOR 
problem and v(t) is the command or feedforward input 

v{d) = E-*{d)B*{d)H'ip y f{d) (5.8-27) 
= E-\d)B 2 {d)H'ip y r{d) 

Provided that E(d) is strictly Hurwitz and the modified plant with input 5u(t) and 
output y(t) is free of unstable hidden modes, the 2-DOF LQ controller yields, thanks 
to its integral action, asymptotic rejection of constant disturbances and an offset- 
free closed-loop system. 

Theorem 1 can be used so as to single out a 2-DOF controller based on MKI. The same holds true for TCC, the Truncated Cost Control, the 2-DOF tracking extension of TCI. While in the first the underlying pure regulation problem coincides with LQOR, this is still essentially true in the second case provided that the prediction horizon is taken large enough. Nevertheless, both MKI and TCI yield the LQOR law $\delta u(t) = Fx(t)$, where clearly $F = -X^{-1}Y$. The problem here is to determine the 2-DOF control law

    $\delta u(t) = Fx(t) + X^{-1}v(t)$    (5.8-28)

by computing $X^{-1}v(t)$ directly from the feedback-gain matrix and knowledge of the plant, without the need of solving the spectral factorization problem (15). To this end, rewrite (14) as follows

    $Q(d) := X^{-1}E(d) = A_2(d) - FB_2(d)$    (5.8-29)

Then, since $A_2(1) = O_p$ and, by Theorem 1, the closed-loop system is offset-free, we get

    $X^{-1}v(d) = X^{-1}Q^{-*}(d)(X')^{-1}B_2^*(d)H'\psi_y\,\hat r(d)$    (5.8-30)
    $X^{-1} = -FB_2(1)\,[HB_2(1)]^{-1}\,\psi_y^{-1/2}$    (5.8-31)

where $(\psi_y^{1/2})'\,\psi_y^{1/2} = \psi_y$.

Problem 5.8-3 Verify (30) and (31). 

Eqs. (29)-(31) allow us to compute the 2-DOF control law (28) without the need 
of solving the spectral factorization problem (15), by only using the plant model 
and the feedback F. 

SIORHC is an acronym for Stabilizing I/O Receding Horizon Controller which 
is the 2-DOF tracking extension of SIORHR. SIORHC is obtained by modifying 
(4-25)-(4-28) as follows. Given 



(5.8-32) 



consider the problem of finding, whenever it exists, an input sequence $\delta\hat u_{[t,t+T)}$ to the SISO plant

    $\Delta(d)A(d)\,y(t) = B(d)\,\delta u(t)$    (5.8-33)



minimizing 



t+T-l 



j(s(t),Su [tit+T) ) = [^y£ 2 (k + l)+^ u 5u 2 (k)] 



under the constraints 

^ u t+T+n-2 = On-1 i 

with 
and 



vlXlXUn-^dt + T + l) 

n := max {n a + 1, n^} 
r(k) 



r{k) := 



r(k) 



> n 



Then, the plant input at time t given by SIORHC equals 

Su(t) = 6u(t) 



(5.8-34a) 

(5.8-34b) 
(5.8-35) 

(5.8-36) 
(5.8-37) 



The solution of problem (32)-(36) can be obtained mutatis mutandis as in Sect. 5 
in the following form 



Su 



t+T-l 



-M- l ^p v (I T - QLM- 1 ) W[ [T lS (t) 
Q [T 2 s(t) -r(t + £ + T)] } 



r t+i+i 1 
t+e+T-il 



(5.8-38)

with M, Q, L, $\Gamma_1$ and $\Gamma_2$ as in (5-21). Hence, the SIORHC law equals

    $\delta u(t) = e_1'\,\delta\hat u_t^{t+T-1}$    (5.8-39)

which, in turn, can be also written in polynomial form as follows

    $R(d)\,\delta u(t) = -S(d)\,y(t) + Z^*(d)\,r(t+1)$    (5.8-40)

In (40) R(d) and S(d) are polynomials similar to the ones in (6) and

    $Z^*(d) = z_1 d^{-1} + \cdots + z_T d^{-T}$    (5.8-41)

Problem 5.8-4 Verify that (38) solves the problem (32)-(36). 

We note that the stabilizing properties of (39) or (40) can be directly deduced from the ones of SIORHR in Theorem 5-1, taking into account that the initial plant A(d) polynomial has now become $\Delta(d)A(d)$. Further, we show hereafter that (33) controlled by SIORHC is offset-free whenever the closed-loop system is stable. Using (33) and (40) we find the following equation for the controlled system

    $[\Delta(d)A(d)R(d) + B(d)S(d)] \begin{bmatrix} y(t) \\ \delta u(t) \end{bmatrix} = \begin{bmatrix} B(d)Z^*(d) \\ \Delta(d)A(d)Z^*(d) \end{bmatrix} r(t+1)$    (5.8-42)

Hence, provided that $\Delta(d)A(d)R(d) + B(d)S(d)$ is strictly Hurwitz, if $r(t) \equiv \bar r$, we have $\bar y := \lim_{t\to\infty} y(t) = \frac{Z^*(1)}{S(1)}\,\bar r$ and $\overline{\delta u} := \lim_{t\to\infty} \delta u(t) = 0$. In order to prove that $S(1) = Z^*(1)$, comparing (38) and (40) we show that every row of $\Gamma_1$ and $\Gamma_2$ has its first n entries which sum up to one. To see this, we consider an equation similar to (5-27) in the present context

    $1 = \Delta(d)A(d)Q_k(d) + d^k G_k(d), \qquad \partial Q_k(d) \le k - 1$    (5.8-43)

For d = 1, since $\Delta(1) = 0$, we get

    $G_k(1) = 1$    (5.8-44)

Hence, the coefficients of the polynomials $G_k(d)$ sum up to one. Finally the desired property follows, since, by (5-8) and (5-28), the first n entries of the rows of $\Gamma_1$ and $\Gamma_2$ coincide with the coefficients of $G_k(d)$.
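
Property (44) is easy to check numerically. The sketch below solves the Diophantine equation (43) by long division in ascending powers of d and verifies that the coefficients of $G_k(d)$ sum to one. The helper name `diophantine` is our own, and the polynomial A(d) is borrowed from Example 3 purely as a test case.

```python
import numpy as np

def diophantine(p, k):
    """Solve 1 = P(d) Q_k(d) + d^k G_k(d) with deg Q_k <= k - 1.

    p holds the coefficients of P(d) in ascending powers of d, with p[0] = 1.
    Returns (q, g), the coefficients of Q_k(d) and G_k(d) in ascending powers.
    """
    p = np.asarray(p, dtype=float)
    rem = np.zeros(k + len(p) - 1)
    rem[0] = 1.0                       # start from the polynomial "1"
    q = np.zeros(k)
    for i in range(k):                 # one long-division step per power of d
        q[i] = rem[i] / p[0]
        rem[i:i + len(p)] -= q[i] * p
    return q, rem[k:]                  # what is left is d^k G_k(d)

delta = [1.0, -1.0]                    # Delta(d) = 1 - d
A = [1.0, 0.9, -0.5]                   # A(d) = 1 + 0.9 d - 0.5 d^2 (test case)
P = np.convolve(delta, A)              # Delta(d) A(d)

# G_k(1) for k = 1, ..., 5
sums = [diophantine(P, k)[1].sum() for k in range(1, 6)]
```

Because $\Delta(1) = 0$ forces $P(1) = 0$, every entry of `sums` equals one up to rounding, whatever A(d) is.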

The above results are summed up in the following theorem. 

Theorem 5.8-2 (SIORHC). Under the same assumptions as in Theorem 5-1 with A(d) replaced by $\Delta(d)A(d)$, the SIORHC law

Su(t) = -eiM" 1 !^ {It - QLM^ 1 ) W[ [T lS {t) - r t '+^] 

+Q[T 2 s(t) -r(t + e + T)}} (5.8-45) 

where s(t) is as in (32), inherits all the stabilizing properties of SIORHR. Further, 
whenever stabilizing, SIORHC yields, thanks to its integral action, asymptotic rejection of constant disturbances, and an offset-free closed-loop system.

GPC stands for Generalized Predictive Control, the 2-DOF tracking extension of GPR. GPC is obtained by modifying (6-12)-(6-16) as follows. Given s(t) as in (32), find an input sequence $\delta\hat u_{[t,t+N_u)}$ to the plant (33) minimizing for $N_u < N_2$

    $J_{GPC} = \sum_{k=t+N_1}^{t+N_2} \varepsilon_y^2(k+\ell) + \psi_u \sum_{k=t}^{t+N_u-1} \delta u^2(k)$    (5.8-46)

under the constraints

    $\delta u_{t+N_u}^{t+N_2-1} = O_{N_2-N_u}$    (5.8-47)

    $r_{t+\ell+N_u}^{t+\ell+N_2} = \underline r(t+\ell+N_u) := [\,r(t+\ell+N_u)\ \cdots\ r(t+\ell+N_u)\,]' \in R^{N_2-N_u+1}$    (5.8-48)

Then, the GPC plant input at time t is given by

    $\delta u(t) = \delta\hat u(t)$    (5.8-49)

A remark on (48) is in order. It is seen that (48) is a constraint on the reference to be tracked. Consequently, (48) should not be included in the GPC formulation but, on the contrary, it should be fulfilled by the reference itself. On the other hand, to hold for all t, (48) implies, for $N_u < N_2$, that the reference is constant: a contradiction with our goal to use 2-DOF controllers to get high performance tracking with general reference sequences. The correct interpretation for (48) is that, whatever the reference future behaviour, the controller pretends that the reference is constant from time $t + \ell + N_u$ through $t + \ell + N_2$. This assumption, being consistent with the input constraints (47), will be referred to as the reference consistency constraint. Taking (35) into account, we note that such a constraint is embedded in the SIORHC formulation. As the next example shows, the reference consistency constraint is important for ensuring good tracking performance.



Example 5.8-3 Consider the SISO plant $\Delta(d)A(d)y(t) = B(d)\delta u(t)$ with

    $A(d) = 1 + 0.9d - 0.5d^2 \qquad B(d) = d + 1.01d^2$

SIORHC with T = 3 and GPC with $N_1 = N_u = 3$, $N_2 = 5$ and $\psi_u = 0$, both yield a state-deadbeat controller whose tracking performance, when the reference consistency condition is satisfied, is shown in Fig. 3. Fig. 4 shows that, if the reference consistency condition is violated, the tracking performance becomes unacceptable.
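
In an implementation, the reference consistency constraint amounts to a small preprocessing step on the reference window handed to the controller. The following sketch shows one way to do it. The function name `consistent_reference` and the indexing convention (entry i stands for $r(t+\ell+i)$) are assumptions made here for illustration.

```python
import numpy as np

def consistent_reference(r_future, n_u):
    """Hold the reference constant from offset n_u onward.

    r_future[i] stands for r(t + l + i), i = 0, ..., N_2 (assumed convention).
    The samples from offset n_u on are replaced by r_future[n_u], which is the
    constant future the controller "pretends" to see.
    """
    r = np.array(r_future, dtype=float)    # copy: the caller's data is untouched
    r[n_u:] = r[n_u]
    return r

# A ramp reference with N_u = 3 and N_2 = 5, the horizons of Example 3.
window = consistent_reference([0, 1, 2, 3, 4, 5], n_u=3)
```

Feeding the controller the unmodified ramp instead of `window` corresponds to the violated-consistency situation of Fig. 4.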

Following developments similar to (6-17)-(6-23), we find 

si + N u -i = -MG 1 W^\r G s(t)-f(t + e,N 1 ,N u )] (5.8-50) 

t+l+Nt 



1 t+e+N u -i 
r(t + £+N u ) 



N 2 -Nx + l (5.8-51) 



By the same arguments as in (43) and (44), it follows that every row of $\Gamma_G$ has
its first n entries which sum up to one. Hence, as with SIORHC, GPC yields 
zero-offset. 



Problem 5.8-5 Verify that (50) solves problem (46)-(48).

Figure 5.8-3: Deadbeat tracking for the plant of Example 3 controlled by SIORHC (or GPC) when the reference consistency condition is satisfied. T = 3 is used for SIORHC ($N_1 = N_u = 3$, $N_2 = 5$ and $\psi_u = 0$ for GPC).

Figure 5.8-4: Tracking performance for the plant of Example 3 controlled by GPC ($N_1 = N_u = 3$, $N_2 = 5$ and $\psi_u = 0$) when the reference consistency condition is violated, viz. the time-varying sequence $r(t + N_u + i)$, $i = 1, \cdots, N_2 - N_u$, is used in calculating u(t).

Theorem 5.8-3 (GPC). Under the same assumption as in Theorem 6-1 with A(d) replaced by $\Delta(d)A(d)$, the GPC law given by

Su(t) = -^M^W'g [T G s(t) -f(t + t,N lt N u )] (5.8-52) 

inherits the state-deadbeat property of GPR. Further, whenever stabilizing, GPC yields an offset-free closed-loop system and, thanks to its integral action, asymptotic rejection of constant disturbances.



5.8.3 Reference Management and Predictive Control 

In many cases it is convenient to distinguish between the reference sequence r(·) used in the control laws and the desired plant output w(·). An example is to let r(·) be a filtered version of w(·)

    $r(t) = \frac{1}{H(d)}\,w(t)$    (5.8-53)

or

    $H(d)\,r(t) = w(t)$    (5.8-54)

with $H(d) = \varepsilon^{-1}[1 - (1 - \varepsilon)d]$, $H(1) = 1$, and $\varepsilon$ such that $0 < \varepsilon < 1$ and small for low-pass filtering w(t), e.g. $\varepsilon = 0.25$. For 2-DOF trackers based on RHR, another possibility to smooth the transition from the current output y(t) to a desired constant set-point w is to let

    $r(t) = y(t)$
    $r(t+i) = (1 - \varepsilon)\,r(t+i-1) + \varepsilon w$    (5.8-55)

In designing 2-DOF trackers a different approach consists of leaving w(·) unaltered and, instead, filtering y(·) and u(·). Viz., the performance index (11) is changed into

    $J = \|y_H(\cdot) - w(\cdot)\|^2_{\psi_y} + \|\delta u_H(\cdot)\|^2_{\psi_u}$    (5.8-56)

where $y_H(\cdot)$ and $u_H(\cdot)$ are filtered versions of y(·) and, respectively, u(·)

    $y_H(t) = H(d)\,y(t) \qquad u_H(t) = H(d)\,u(t)$    (5.8-57)

with H(d) a strictly Hurwitz polynomial such that H(1) = 1. Note that, taking into account that the plant can also be represented as

    $\Delta(d)A(d)\,y_H(t) = B(d)\,\delta u_H(t)$

we now find for the 2-DOF LQ control law, instead of (26),

    $X\,\delta u_H(t) = -Y x_H(t) + v(t)$

with v(t) given again by (27), $v(d) = E^{-*}(d)B_2^*(d)H'\psi_y\,\hat w(d)$, and $x_H(t) = H(d)x(t)$. Consequently

    $X\,\delta u(t) = -Y x(t) + \frac{1}{H(d)}\,v(t)$

Hence, we conclude that filtering y(t) and u(t) as in (56) and (57) has the effect of leaving everything unchanged with the exception of changing w(t) into r(t) = w(t)/H(d). In other terms, in the present deterministic context, (56) and (57) are equivalent to directly filtering the desired plant output w(t) as in (54) to get the reference r(t). As will be shown in Sect. 7.5, the approach of (56) and (57) provides us with some additional benefits should stochastic disturbances be present in the loop.

1-DOF or 2-DOF tracking based on the receding horizon method is referred 
to as Receding Horizon Control (RHC). This reduces to RHR when the reference 
to be tracked is identically zero. The name Multistep Predictive Control or Long- 
Range Predictive Control has been customarily used in the literature to designate 
2-DOF RHC whereby the control action is selected taking into account the fu- 
ture evolution over a multi-step prediction horizon of both the plant state and the 
reference as provided by "predictive" dynamic models. From the results of this sec- 
tion, particularly Theorem 1, it follows that 2-DOF LQ control satisfies the above 
characterization and hence will be regarded as a long-range predictive controller 
as well. For the sake of brevity, unless needed to avoid possible confusion, from 
now on we shall often omit the attribute "Multistep" or "Long-Range" and simply 
refer to the above class as Predictive Control. 

A peculiar and important feature in Predictive Control is that the future evo- 
lution of the reference can be designed in real-time (Cf. (55)). This can be done, 
taking into account the current value of the plant output y(t) and the desired set- 
point w, so as to insure that the plant input u(t) be within admissible bounds and, 
hence, avoid saturation phenomena. However, this mode of operation, whereby 
r{t + i) is made dependent on y(t) as in (55), introduces an extra feedback loop 
which must be proved not to destabilize the closed-loop system. 

Main points of the section 1-DOF and 2-DOF controllers can be designed 
by suitably modifying the basic RHR laws so as to insure asymptotic rejection 
of constant disturbances and zero offset. In contrast with 1-DOF controllers, in
2-DOF controllers the reference to be tracked is processed, independently of the 
plant output which is fed back, by a feedforward filter so as to enhance the tracking 
performance of the controlled system. Predictive control is a 2-DOF RHC whereby 
the feedforward action depends on the future reference evolution which, in turn, 
can be selected on-line so as to avoid saturation phenomena. 

Notes and References 

At the beginning of the seventies two successive papers, [Kle70] and [Kle74], proposed a simple method to stabilize time-invariant linear plants. This method was
later adopted in [Tho75] by using the concept of a receding horizon. [KP75] and 
[KP78] extended RHR to stabilize time-varying linear plants. See also [KBK83]. 
It took longer than fifteen years [CM93] to extend the stabilizing property of zero 
terminal state RHR to the case of a possibly singular state transition matrix as 
in Theorem 3-2. In a different direction, [TSS77], [Sha79] and [CS82] considered 
nonlinear state-dependent RHR for time-invariant linear plants so as to speed up 
the response to large regulation errors. Extensions of RHR to stabilize nonlinear 
systems were reported in [MM90a], [MM90b], [MM91a], and [MM91b]. 

The concept of Predictive Control, wherein on-line reference design takes place, first appeared in [Mar76a] and [Mar76b]. Subsequent approaches to predictive control, particularly from the standpoint of industrial process control, were referred to as Model Predictive Control [RRTP78] and Dynamic Matrix Control [CR80]. See also the survey [GPM89].

SIORHR and SIORHC were introduced in [MLZ90], [MZ92] and, independently, 
[CS91]. GPC was introduced in [CTM85] and discussed in [CMT87a], [CMT87b] 
and [CM89]. For continuous time GPC see [DG91] and [DG92]. MKI and TCI 
are related to the self-tuning algorithms first reported in [ML89] , and, respectively, 
[Mos83]. For other contributions, see also [Pet84], [Yds84], [dKvC85], [LM85], 
[TC88], [RT90], and [Soe92].



CHAPTER 6



RECURSIVE STATE FILTERING AND SYSTEM IDENTIFICATION



This chapter addresses problems of state and system parameter estimation. Our 
interest is mainly directed to solutions usable in real-time, viz. recursive estimation 
algorithms. They are introduced by initially considering a simple but pervasive 
estimation problem consisting of solving a system of linear equations in an unknown 
time-invariant vector. At each time-step a number of new equations becomes 
available and the estimate is required to be recursively updated. 

In Sect. 1 this problem is formulated as an indirect sensing measurement prob- 
lem, and solved via an orthogonalization method. When the unknown coincides 
with the state of a stochastic linear dynamic system we get a Kalman filtering prob- 
lem of which in Sect. 2 we derive in detail the Riccati-based solution and its duality 
with the LQR problem of Chapter 2. We also briefly describe the polynomial equa- 
tions for the steady-state Kalman filter as well as how the so called innovations 
representation originates. In Sect. 3 we consider several system identification algo- 
rithms. In this respect, to break the ice a simple deterministic system parameter 
estimation algorithm is derived by direct use of the indirect sensing measurement 
problem solution. This simple algorithm is next modified so as to take into account 
several impairments due to disturbances. In this way RLS, RELS, RML, and the 
Stochastic Gradient algorithm are introduced and related to algorithms derived 
systematically via the Prediction Error Method. In Sect. 4 convergence analysis of 
the above algorithms is considered. 



6.1 Indirect Sensing Measurement Problems 

We begin with addressing problems of filtering and system parameter estimation within a common framework. The distinct elementary ingredients are: an unknown w; a set of time-indexed stimuli $p_t$ by which w can be probed; a set of accessible reactions $m_t$ of w to $p_t$. The cause-effect correspondence between $p_t$ and $m_t$ via the unknown w is linear and of the simplest nontrivial form. Loosely stated, the problem is to find, at every t, an approximation to the unknown w based on the knowledge of present and past stimuli and corresponding reactions. This simple scheme encompasses seemingly different applications, ranging from problems of system parameter estimation to Kalman filtering.

In order to accommodate different applications in a single mathematical framework, an abstract setting has to be used. Let $\mathcal{H}$ be a vector space, not necessarily of finite dimension, equipped with an inner product $(\cdot,\cdot)$. We say that an ordered pair

    $(p, m) \in \mathcal{H} \times R$    (6.1-1)

is an indirect-sensing linear measurement (ISLM), or simply a measurement, on an unknown vector $w \in \mathcal{H}$ if m equals the value taken on by the inner product (w, p)

    $m = (w, p)$    (6.1-2)

In such a case, we call m the outcome and p the measurement representer.¹ It is assumed that a sequence of integer-indexed measurements

    $(p_k, m_k), \qquad k \in \mathbb{Z}_1 := \{1, 2, \cdots\}$    (6.1-3)

is available. Let $\underline r := \{1, 2, \cdots, r\}$. The sequence $\mathcal{E}^r$ made up of the initial r measurements

    $\mathcal{E}^r := \{(p_k, m_k),\ k \in \underline r\}$    (6.1-4)

will be referred to as the experiment up to the integer r. We say that a vector $v \in \mathcal{H}$ interpolates $\mathcal{E}^r$ if $(v, p_k) = m_k$, $k \in \underline r$.

The ISLM problem is to find a recursive formula for

    $\hat w_{|r}$ := the minimum-norm vector in $\mathcal{H}$ interpolating $\mathcal{E}^r$    (6.1-5)

$\hat w_{|r}$ will be hereafter referred to as the ISLM estimate of w based on $\mathcal{E}^r$. The norm alluded to in (5) is the one induced by $(\cdot,\cdot)$, viz. $\|w\| := +\sqrt{(w, w)}$.

Fig. 1 depicts the geometry of the problem that has been set up. Though the vector w is unknown, given $p_k$ and $m_k = (w, p_k)$, $k \in \underline r$, we can find $\hat w_{|r}$ as follows. Let $[p^r]$ be the linear manifold generated by $p^r := \{p_k\}_{k=1}^r$

    $[p^r] := \mathrm{Span}\{p_k\}_{k=1}^r$

¹ We recall that, if $\mathcal{H}$ is a Hilbert space, every continuous linear functional on $\mathcal{H}$ has the form $(\cdot, p)$ and p is called the functional representer [Lue69]. This accounts for the adopted terminology.

Then $\hat w_{|r}$ equals the orthogonal projection in $\mathcal{H}$ of the unknown w onto $[p^r]$. According to the Orthogonal Projection Theorem [Lue69], this can be found by using the following conditions

    (i) $\hat w_{|r} \in [p^r]$, i.e. $\hat w_{|r} = \sum_{k=1}^r \alpha_k p_k$, $\alpha_k \in R$

    (ii) $\tilde w_{|r} := w - \hat w_{|r} \perp [p^r]$.

Combining (i) and (ii), we get

    $\sum_{k=1}^r \alpha_k\,(p_k, p_i) = (w, p_i) = m_i, \qquad i \in \underline r$    (6.1-6)

Note that $\hat w_{|r}$ minimizes the norm of $\tilde w_{|r} := w - \hat w_{|r}$ among all vectors belonging to $[p^r]$

    $\hat w_{|r} = \arg\min_{v \in [p^r]} \{\|w - v\|^2\}$    (6.1-7)

Then, the ISLM estimate $\hat w_{|r}$ is the same as the minimum-norm error estimate of w based linearly on $p^r$.

Eq. (6) is a system of r linear equations in the r unknowns $\{\alpha_k\}_{k=1}^r$. These equations are known as the normal equations. A system of normal equations can be set up and solved in order to find the minimum norm solution to an underdetermined system of linear equations, viz. a system where $\dim[p^r] < \dim w$. The direct solution of (6) is impractical for two main reasons: first, the number of the equations grows with r, and, second, the solution at the integer r does not explicitly use the one at the integer r - 1. The recursive solution that we intend to find circumvents these difficulties.
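
For reference, the direct (batch) route through the normal equations (6) can be sketched as follows for the Euclidean case $\mathcal{H} = R^n$ with $(w, p) = p'w$. The function name `islm_batch`, the test data and the assumption that the representers are linearly independent (so that the Gram matrix is invertible) are ours.

```python
import numpy as np

def islm_batch(P, m):
    """Minimum-norm interpolant via the normal equations (6.1-6).

    P is an r x n array whose k-th row is the representer p_k; m holds the r
    outcomes m_k = (w, p_k). The rows of P are assumed linearly independent.
    """
    gram = P @ P.T                     # entries (p_k, p_i)
    alpha = np.linalg.solve(gram, m)   # coefficients alpha_k of condition (i)
    return P.T @ alpha                 # w_hat = sum_k alpha_k p_k

# Two measurements of an unknown in R^3: an underdetermined system.
w_true = np.array([1.0, 2.0, 3.0])
P = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0]])
m = P @ w_true
w_hat = islm_batch(P, m)
```

Here `w_hat` is the projection of `w_true` onto the span of the two representers, so the component along the unprobed third axis is zero. Note that the Gram system has to be rebuilt and re-solved each time a measurement is added, which is the drawback the recursive solution removes.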

The following examples show that the simple abstract framework which has 
been set up encompasses seemingly different applications of interest. 

Example 6.1-1 (Deterministic system parameter estimation) Consider the SISO system

    $A(d)y(k) = B(d)u(k)$
    $A(d) := 1 + a_1 d + \cdots + a_{n_a} d^{n_a}$    (6.1-8)
    $B(d) := b_1 d + \cdots + b_{n_b} d^{n_b}$

$k = 1, 2, \cdots, t$. This can be rewritten as

    $y(k) = \varphi'(k-1)\,\theta$    (6.1-9)

    $\varphi(k-1) := [\,-y(k-1)\ \cdots\ -y(k-n_a)\ \ u(k-1)\ \cdots\ u(k-n_b)\,]'$    (6.1-10)

    $\theta := [\,a_1\ \cdots\ a_{n_a}\ \ b_1\ \cdots\ b_{n_b}\,]' \in R^{n_\theta}$    (6.1-11)

with $n_\theta := n_a + n_b$. If $\mathcal{H}$ equals the Euclidean space of vectors in $R^{n_\theta}$ with inner product $(w, p_k) = p_k'w$ and we set

    $w := \theta \qquad p_k := \varphi(k-1) \qquad m_k := y(k)$    (6.1-12)

the ISLM problem amounts to finding a recursive formula for updating the minimum-norm system parameter vector $\theta$ interpolating the I/O data up to time t.
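
To make the correspondence (12) concrete, the sketch below builds the regressors from I/O records and recovers the parameters of a simulated first-order plant by minimum-norm interpolation. The function name `regressor`, the plant coefficients, the input sequence and the zero initial condition are all hypothetical choices made for this illustration. The batch pseudo-inverse is used here only as a stand-in for the recursive formula derived later.

```python
import numpy as np

def regressor(y, u, k, n_a, n_b):
    """phi(k-1) = [-y(k-1) ... -y(k-n_a)  u(k-1) ... u(k-n_b)]'."""
    past_y = [-y[k - i] for i in range(1, n_a + 1)]
    past_u = [u[k - i] for i in range(1, n_b + 1)]
    return np.array(past_y + past_u, dtype=float)

# Hypothetical plant (1 + a1 d) y(k) = b1 d u(k) with y(0) = 0.
theta = np.array([-0.5, 2.0])              # [a1, b1]
u = [1.0, 0.0, 1.0, 0.0]
y = [0.0]
for k in range(1, len(u)):
    y.append(float(regressor(y, u, k, 1, 1) @ theta))   # y(k) = phi'(k-1) theta

# Measurements (p_k, m_k) = (phi(k-1), y(k)), stacked row by row.
Phi = np.array([regressor(y, u, k, 1, 1) for k in range(1, len(u))])
Y = np.array(y[1:])
theta_hat = np.linalg.pinv(Phi) @ Y        # minimum-norm interpolant of the data
```

With two linearly independent regressors already available after k = 2, the interpolant coincides with the true parameter vector. Before that point it is only the minimum-norm vector compatible with the data.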

Example 6.1-2 (Linear MMSE estimation) Consider (Cf. Appendix D) a real-valued random variable v defined on an underlying probability space $(\Omega, \mathcal{F}, P)$ of elementary events $\omega \in \Omega$, with a $\sigma$-algebra $\mathcal{F}$ of subsets of $\Omega$, and probability measure P. We recall that a random variable v is an $\mathcal{F}$-measurable function $v : \Omega \to R$. In particular, we are interested in the set of all random variables with finite second moment

    $\mathcal{E}\{v^2\} := \int_\Omega v^2(\omega)\,P(d\omega) < \infty$

where $\mathcal{E}$ denotes expectation. This set can be made a vector space over the real field via the usual operations of pointwise sum of functions and multiplication of functions by real numbers. Further,

    $(u, v) := \mathcal{E}\{uv\}$    (6.1-13)

satisfies all the axioms of an inner product [Lue69]. This vector space equipped with the inner product (13) will be denoted by $L_2(\Omega, \mathcal{F}, P)$. Setting $\mathcal{H} = L_2(\Omega, \mathcal{F}, P)$ in the ISLM problem, the outcome (2) becomes the cross-correlation $m = \mathcal{E}\{wp\}$, and the experiment (4) consists of r ordered pairs made up by the random variables $p_k$ and the reals $m_k = \mathcal{E}\{wp_k\}$. Here the ISLM estimate equals

    $\hat w_{|r} = \arg\min_{v \in [p^r]} \{\mathcal{E}\{(w - v)^2\}\}$    (6.1-14)

If w and $p_k$ have both zero mean, $\mathcal{E}\{w\} = \mathcal{E}\{p_k\} = 0$, $\hat w_{|r}$ coincides with the linear minimum mean-square error (MMSE) estimate of w based on the observations $p^r$.

For some applications we need to consider a generalization of the above setting. There are, in fact, cases in which w and $p_k$ have a number of components in $\mathcal{H}$. Then, in general,

    $w = [\,w^1\ \cdots\ w^n\,]' \in \mathcal{H}^n$    (6.1-15)
    $p_k = [\,p_k^1\ \cdots\ p_k^p\,]' \in \mathcal{H}^p$    (6.1-16)

(2) takes the form of an outer product matrix

    $m = \{(w, p)\} := \begin{bmatrix} (w^1, p^1) & \cdots & (w^1, p^p) \\ \vdots & & \vdots \\ (w^n, p^1) & \cdots & (w^n, p^p) \end{bmatrix} \in R^{n \times p}$    (6.1-17)

and the minimum-norm ISLM problem is then to find a recursive formula for

    $\hat w_{|r} := [\,\hat w^1_{|r}\ \cdots\ \hat w^n_{|r}\,]'$    (6.1-18)

where, for each $i \in \underline n$,

    $\hat w^i_{|r}$ := the minimum-norm vector in $\mathcal{H}$ interpolating $\mathcal{E}^r = \{(p_k, \{(w^i, p_k)\}),\ k \in \underline r\}$    (6.1-19)

A remark here is in order. In general, $\hat w^i_{|r}$ cannot be found by just considering the "reduced" experiment

    $\{(p_k^j, (w^i, p_k^j)),\ j \in \underline p,\ k \in \underline r\}$

and disregarding the remaining experiments which pertain to the other n - 1 vectors listed in (15). In fact, since the $p_k$'s may depend on w, and consequently $\hat w^i_{|r}$ and the $\hat w^j_{|r}$'s may turn out to be interdependent, it is not possible to solve the problem of finding $\hat w^i_{|r}$ separately from that of $\hat w^j_{|r}$. The Kalman filtering problem studied in the next section is an important example of such a situation.

Problem 6.1-1 Consider the outer product matrix $\{(\cdot,\cdot)\}$ defined in (17). Show that:

    i. $\{(u, v)\} = \{(v, u)\}'$, $u \in \mathcal{H}^m$, $v \in \mathcal{H}^n$;

    ii. $\{(Mu, v)\} = M\{(u, v)\}$ for every matrix $M \in R^{p \times m}$.

Consequently, $\{(u, Mv)\} = \{(u, v)\}M'$, for every matrix $M \in R^{p \times n}$.

The geometric interpretation of Fig. 1 carries over to the general case (15)-(19). To this end, it is enough to set

    $[p^r] := \mathrm{Span}\{p_k,\ k \in \underline r\} = \mathrm{Span}\{p_k^j,\ j \in \underline p,\ k \in \underline r\}$    (6.1-20)

Thus, $\hat w^i_{|r}$ coincides with the orthogonal projection in $\mathcal{H}$ of the unknown $w^i$ onto $[p^r]$

    $\hat w^i_{|r} = \mathrm{Projec}\,[\,w^i \mid p^r\,]$    (6.1-21a)

For the sake of brevity, setting $\hat w_{|r} = [\,\hat w^1_{|r}\ \cdots\ \hat w^n_{|r}\,]'$, instead of (21a) we shall simply write

    $\hat w_{|r} = \mathrm{Projec}\,[\,w \mid p^r\,]$    (6.1-21b)

Further, $\hat w_{|r}$ is uniquely specified by the two requirements

    i. $\hat w^i_{|r} \in [p^r]$, $i \in \underline n$    (6.1-22a)

    ii. $\{(\tilde w_{|r}, p_k)\} = O_{n \times p}$, $k \in \underline r$    (6.1-22b)

where

    $\tilde w_{|r} := w - \hat w_{|r}$    (6.1-23)

Problem 6.1-2 Let $u \in \mathcal{H}^m$ and $v \in \mathcal{H}^n$. Let Projec [u | v] denote the componentwise orthogonal projection in $\mathcal{H}$ of u onto Span{v}. Then show that

    $\mathrm{Projec}\,[u \mid v] = \{(u, v)\}\,\{(v, v)\}^{-1}\,v$    (6.1-24)
Solution by Innovations

Let us now construct from the representers $\{p_k, k \in \mathbb{Z}_1\}$ an orthonormal sequence $\{\nu_k, k \in \mathbb{Z}_1\}$ in $\mathcal{H}^p$ by the Gram-Schmidt orthogonalization procedure [Lue69]. Here by orthonormality we mean

    $\{(\nu_r, \nu_k)\} = I_p\,\delta_{r,k} \qquad \forall r, k \in \mathbb{Z}_1$    (6.1-25)

where $\delta_{r,k}$ denotes the Kronecker symbol. Accordingly,

    $e_r := p_r - \sum_{k=1}^{r-1} \{(p_r, \nu_k)\}\,\nu_k$    (6.1-26)

    $\nu_r := \begin{cases} L_r^{-T/2}\,e_r, & L_r \text{ nonsingular} \\ O_{\mathcal{H}^p}, & \text{otherwise} \end{cases}$    (6.1-27)

where $L_r$ equals the symmetric positive definite matrix

    $L_r := \{(e_r, e_r)\} \in R^{p \times p}$    (6.1-28)

$L_r^{-T/2} := (L_r^{T/2})^{-1}$, $L_r^{T/2} := (L_r^{1/2})'$ and $L_r^{1/2}$ any p × p matrix such that $L_r^{T/2}L_r^{1/2} = L_r$. Since (26) can be rewritten as

    $e_r = p_r - \hat p_{r|r-1}, \qquad \hat p_{r|r-1} := \mathrm{Projec}\,[\,p_r \mid p^{r-1}\,]$    (6.1-29)

every $e_r$ is obtained by subtracting from $p_r$ its ISLM one-step-ahead prediction, i.e. its ISLM estimate based on the experiment

    $\{(p_k, m_k = \{(p_r, p_k)\}),\ k \in \underline{r-1}\}$

up to the immediate past. For this reason, hereafter $\{e_r, r \in \mathbb{Z}_1\}$ will be called the sequence of innovations of $\{p_r, r \in \mathbb{Z}_1\}$ and $\{\nu_r, r \in \mathbb{Z}_1\}$ that of the normalized innovations.

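The introduction of the innovations leads to easily finding an updating equation for the orthogonal projector

    $\mathcal{S}_r(\cdot) := \mathrm{Projec}\,[\,\cdot \mid p^r\,] : \mathcal{H} \to [p^r]$    (6.1-30)

which will be referred to as the estimator at the step r associated to the given experiment. Indeed, since $[\nu_r] := \mathrm{Span}\{\nu_r\}$ is the orthogonal complement of $[p^{r-1}]$ in $[p^r]$, we have that

    $[p^r] = [p^{r-1}] \oplus [\nu_r]$    (6.1-31)

Therefore, setting $\mathcal{S}_0(\cdot) := O_{\mathcal{H}}$, we have

    $\mathcal{S}_r(\cdot) = \mathcal{S}_{r-1}(\cdot) + \{(\cdot, \nu_r)\}\,\nu_r$
    $\phantom{\mathcal{S}_r(\cdot)} = \mathcal{S}_{r-1}(\cdot) + \{(\cdot, e_r)\}\,L_r^{-1}e_r$    (6.1-32)

The n-order extension of $\mathcal{S}_r(\cdot)$ is defined as follows

    $\mathcal{S}_r^n : [\,v^1\ \cdots\ v^n\,]' \mapsto [\,\mathcal{S}_r(v^1)\ \cdots\ \mathcal{S}_r(v^n)\,]' : \mathcal{H}^n \to [p^r]^n$    (6.1-33)

The innovator $\mathcal{I}_r(\cdot)$ at the step r is defined as the orthogonal projector mapping $\mathcal{H}$ onto the orthogonal complement $[p^{r-1}]^\perp$ of $[p^{r-1}]$

    $\mathcal{I}_r(\cdot) = I(\cdot) - \mathcal{S}_{r-1}(\cdot) : \mathcal{H} \to [p^{r-1}]^\perp$    (6.1-34)

where $I(\cdot)$ is the identity transformation in $\mathcal{H}$. Its n-order extension is defined as follows

    $\mathcal{I}_r^n : [\,v^1\ \cdots\ v^n\,]' \mapsto [\,\mathcal{I}_r(v^1)\ \cdots\ \mathcal{I}_r(v^n)\,]'$    (6.1-35)

The following identity justifies the terminology used for $\mathcal{I}_r(\cdot)$

    $p_r - \hat p_{r|r-1} = [\,I^p - \mathcal{S}^p_{r-1}\,](p_r) = \mathcal{I}^p_r(p_r)$    (6.1-36)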
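Proposition 6.1-1. The innovator and, respectively, the estimator satisfy the following recursions

    $\mathcal{I}^n_r(\cdot) = \mathcal{I}^n_{r-1}(\cdot) - \{(\cdot, \mathcal{I}^p_{r-1}(p_{r-1}))\}\,L_{r-1}^{-1}\,\mathcal{I}^p_{r-1}(p_{r-1})$    (6.1-37)
    $\mathcal{S}^n_r(\cdot) = \mathcal{S}^n_{r-1}(\cdot) + \{(\cdot, \mathcal{I}^p_r(p_r))\}\,L_r^{-1}\,\mathcal{I}^p_r(p_r)$    (6.1-38)

where $\mathcal{I}_1(\cdot) = I(\cdot)$, $\mathcal{S}_0(\cdot) = O_{\mathcal{H}}$, and

    $L_r = \{(p_r, \mathcal{I}^p_r(p_r))\}$    (6.1-39)

Proof Eqs. (37) and (38) follow at once from (32)-(36). Eq. (39) is proved by using self-adjointness and idempotency of $\mathcal{I}_r(\cdot)$. In fact the latter, being an orthogonal projector, is idempotent, i.e. $\mathcal{I}_r^2(\cdot) = \mathcal{I}_r(\cdot)$, and self-adjoint, i.e. $(\mathcal{I}_r(u), v) = (u, \mathcal{I}_r(v))$ for all $u, v \in \mathcal{H}$. Hence

    $L_r = \{(\mathcal{I}^p_r(p_r), e_r)\}$    [(36)]
    $\phantom{L_r} = \{(p_r, \mathcal{I}^p_r(e_r))\}$    [self-adjointness]
    $\phantom{L_r} = \{(p_r, (\mathcal{I}^p_r)^2(p_r))\}$    [(36)]
    $\phantom{L_r} = \{(p_r, \mathcal{I}^p_r(p_r))\}$    [idempotency]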
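The desired recursive formula for $\hat w_{|r}$ can now be obtained by simply applying (38) to the unknown w. Indeed,

    $\hat w_{|r} = \mathcal{S}^n_r(w) = \mathcal{S}^n_{r-1}(w) + \{(w, \mathcal{I}^p_r(p_r))\}\,L_r^{-1}\,\mathcal{I}^p_r(p_r)$
    $\phantom{\hat w_{|r}} = \hat w_{|r-1} + \{(w, \mathcal{I}^p_r(p_r))\}\,L_r^{-1}\,\mathcal{I}^p_r(p_r)$    (6.1-40)

Further, self-adjointness of $\mathcal{I}_r$ yields

    $\{(w, \mathcal{I}^p_r(p_r))\} = \{(\mathcal{I}^n_r(w), p_r)\}$    (6.1-41)
    $\phantom{\{(w, \mathcal{I}^p_r(p_r))\}} = \{(w - \hat w_{|r-1}, p_r)\}$    [(34)]
    $\phantom{\{(w, \mathcal{I}^p_r(p_r))\}} = \tilde m_{r|r-1}$

where

    $\tilde m_{r|r-1} := m_r - \hat m_{r|r-1}$    (6.1-42)

is the one-step-ahead prediction error on the measurement outcome at the step r and

    $\hat m_{r|r-1} := \{(\hat w_{|r-1}, p_r)\}$    (6.1-43)

Theorem 6.1-1. The ISLM estimate $\hat w_{|r}$ of $w \in \mathcal{H}^n$ based on the experiment $\mathcal{E}^r$ satisfies the following recursion

    $\hat w_{|r} = \hat w_{|r-1} + \tilde m_{r|r-1}\,L_r^{-1}\,\mathcal{I}^p_r(p_r)$    (6.1-44)

where $\mathcal{I}_r(\cdot)$ is the innovator satisfying (37), $L_r$ and $\tilde m_{r|r-1}$ are given by (39) and, respectively, (42), and $\hat w_{|0} = O_{\mathcal{H}^n}$.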

The results of Theorem 1 are schematically depicted in Fig. 2. 
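
In the Euclidean setting of Example 1, with one scalar measurement per step, the innovator is simply a symmetric projection matrix and recursions (37), (39) and (44) take only a few lines. The sketch below is one possible rendering under these assumptions. The function name `islm_recursive`, the tolerance `tol` used to detect a vanishing $L_r$ and the test data are our own.

```python
import numpy as np

def islm_recursive(P, m, tol=1e-12):
    """Recursive ISLM estimate in R^n for scalar measurements m_k = p_k' w.

    P is an r x n array of representers (one per row), m the r outcomes.
    """
    n = P.shape[1]
    w_hat = np.zeros(n)                    # w_hat_{|0} = 0
    innovator = np.eye(n)                  # I_1 = identity
    for p_r, m_r in zip(P, m):
        e_r = innovator @ p_r              # innovation I_r(p_r), cf. (36)
        L_r = p_r @ e_r                    # (39)
        if L_r > tol:                      # otherwise p_r carries no new direction
            w_hat = w_hat + (m_r - p_r @ w_hat) / L_r * e_r     # (44)
            innovator = innovator - np.outer(e_r, e_r) / L_r    # (37)
    return w_hat

w_true = np.array([1.0, 2.0, 3.0])
P = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0]])            # third row depends on the first two
m = P @ w_true
w_hat = islm_recursive(P, m)
```

Each step reuses the previous estimate and the previous innovator, so the cost per measurement does not grow with r. The redundant third measurement yields a zero innovation and is skipped, in the spirit of the singular case of the normalized innovations.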
We consider next the matrix $\{(\tilde w_{|r}, \tilde w_{|r})\}$ which quantifies the estimation errors at the step r

    $M_{r+1} := \{(\tilde w_{|r}, \tilde w_{|r})\} = \{(\mathcal{I}^n_{r+1}(w), \mathcal{I}^n_{r+1}(w))\}$    [(34)]
    $\phantom{M_{r+1}} = \{(\mathcal{I}^n_{r+1}(w), w)\}$    [self-adjointness & idempotency]
    $\phantom{M_{r+1}} = M_r - \{(w, \mathcal{I}^p_r(p_r))\}\,L_r^{-1}\,\{(\mathcal{I}^p_r(p_r), w)\}$    (6.1-45)

with $M_1 = \{(w, w)\}$. In linear algebra $M_{r+1}$ is known [Lue69] as the Gram matrix of the vectors collected in $\tilde w_{|r}$. Its recursive formula (45) is reminiscent of a Riccati difference equation. In fact, as will be shortly shown, it becomes a Riccati equation once suitable auxiliary assumptions on the structure of the problem at hand are made.
Figure 6.1-2: Block diagram view of algorithm (37)-(44) for computing recursively the ISLM estimate $\hat w_{|r}$.

Main points of the section On-line linear MMSE estimation of a random variable from time-indexed observations as well as deterministic system parameter estimation from I/O data can be both seen as particular versions of the ISLM problem. The latter consists of finding recursively the minimum norm solution to an underdetermined system of linear equations, the number of equations in the system growing by n at each time step.

The ISLM problem can be recursively solved via the Gram-Schmidt orthogonalization procedure by constructing the innovations of the measurement representers.



6.2 Kalman Filtering 
6.2.1 The Kalman Filter 

We apply (1-44) to the linear MMSE estimation setting of Example 1-2 generalized to the random vector case. In such a case

    $\tilde m_{r|r-1} = \mathcal{E}\{\tilde w_{|r-1}\,p_r'\}$    (6.2-1)

    $L_r = \mathcal{E}\{p_r e_r'\}$    (6.2-2)

Therefore, (1-44) yields

    $\hat w_{|r} = \hat w_{|r-1} + \mathcal{E}\{\tilde w_{|r-1}\,p_r'\}\,(\mathcal{E}\{p_r e_r'\})^{-1}\,e_r$    (6.2-3)

This is an updating formula for the linear MMSE estimate yet at a quite unstructured level. We next assume that the following relationship holds between w and $p_r$

    $p_r := z(t) = H(t)w + \zeta(t)$    (6.2-4)

where

    $t := r + t_0 - 1 \in \{t_0, t_0 + 1, \cdots\}$,    (6.2-5)

H(t) is a (p × n) matrix with real entries, and $\zeta(t)$ a vector-valued random sequence with zero mean, white, with covariance matrix $\Psi_\zeta(t)$

    $\mathcal{E}\{\zeta(t)\} = O_p \qquad \mathcal{E}\{\zeta(t)\zeta'(\tau)\} = \Psi_\zeta(t)\,\delta_{t,\tau}$    (6.2-6)

and uncorrelated with w

    $\mathcal{E}\{w\zeta'(t)\} = O_{n \times p}$    (6.2-7)

Then

    $e(t) := e_r = p_r - \hat p_{r|r-1}$    (6.2-8)
    $\phantom{e(t)} = z(t) - H(t)\hat w(t-1)$
    $\phantom{e(t)} = H(t)\tilde w(t-1) + \zeta(t)$

where

    $\hat w(t) := \hat w_{|t}$ and $\tilde w(t) := w - \hat w(t)$    (6.2-9)

Further,

    $L(t) := L_r = H(t)M(t)H'(t) + \Psi_\zeta(t)$    (6.2-10)

    $\mathcal{E}\{\tilde w(t-1)z'(t)\} = M(t)H'(t)$

where $M(t) := \mathcal{E}\{\tilde w(t-1)\tilde w'(t-1)\}$ takes here the form of the following Riccati difference equation (RDE)

    $M(t+1) = M(t) - \{(w, \mathcal{I}^p_r(p_r))\}\,L_r^{-1}\,\{(\mathcal{I}^p_r(p_r), w)\}$    [(1-45)]
    $\phantom{M(t+1)} = M(t) - M(t)H'(t)\,[H(t)M(t)H'(t) + \Psi_\zeta(t)]^{-1}\,H(t)M(t)$    (6.2-11)

with $M(t_0) = \mathcal{E}\{ww'\}$. Hence, (3) becomes

    $\hat w(t) = \hat w(t-1) + M(t)H'(t)\,[H(t)M(t)H'(t) + \Psi_\zeta(t)]^{-1}\,[z(t) - H(t)\hat w(t-1)]$    (6.2-12)
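
One step of the recursion (10)-(12) can be sketched in a few lines. The function name `mmse_update`, the argument layout and the scalar test values are assumptions made for illustration, and a plain matrix inverse is used for brevity even though solving a linear system is usually preferable numerically.

```python
import numpy as np

def mmse_update(w_hat, M, z, H, Psi):
    """One step of (6.2-10)-(6.2-12) for a constant unknown w.

    w_hat, M : previous estimate w_hat(t-1) and error covariance M(t)
    z, H, Psi: observation z(t), matrix H(t), noise covariance Psi_zeta(t)
    Returns the updated estimate w_hat(t) and the covariance M(t+1).
    """
    L = H @ M @ H.T + Psi                  # innovation covariance (6.2-10)
    gain = M @ H.T @ np.linalg.inv(L)      # M(t) H'(t) L^{-1}(t)
    w_new = w_hat + gain @ (z - H @ w_hat) # (6.2-12)
    M_new = M - gain @ H @ M               # RDE (6.2-11)
    return w_new, M_new

# Scalar illustration: unit prior variance, unit noise variance, z = 2.
w1, M1 = mmse_update(np.zeros(1), np.eye(1), np.array([2.0]), np.eye(1), np.eye(1))
```

With equal prior and noise variances the estimate moves halfway towards the observation and the error variance is halved.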

This result can be extended in a straightforward way to cover the linear MMSE 
estimation problem of the state 

w := x(t) (6.2-13) 
of a dynamic system evolving in accordance with the following equation 

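$$ x(t+1) = \Phi(t)x(t) + \xi(t) \tag{6.2-14} $$

where: $t = t_0, t_0 + 1, \cdots$; $\Phi(t) \in \mathbb{R}^{n \times n}$; $\xi(t)$ is a vector-valued random sequence with zero mean, white, with covariance matrix $\Psi_\xi(t)$

$$ \mathcal{E}\{\xi(t)\} = O_n \qquad \mathcal{E}\{\xi(t)\xi'(\tau)\} = \Psi_\xi(t)\,\delta_{t,\tau} \tag{6.2-15} $$

and uncorrelated with $\zeta(\tau)$

$$ \mathcal{E}\{\xi(t)\zeta'(\tau)\} = O_{n \times p}\,, \quad \forall t, \tau \ge t_0 \tag{6.2-16} $$

Finally, the initial state $x(t_0)$ is a random vector with zero mean and uncorrelated
both with $\xi(t)$ and $\zeta(t)$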

$$ \mathcal{E}\{x(t_0)\} = O_n \qquad \mathcal{E}\{x(t_0)\xi'(t)\} = O_{n \times n} \qquad \mathcal{E}\{x(t_0)\zeta'(t)\} = O_{n \times p} \tag{6.2-17} $$

Theorem 6.2-1 (Kalman Filter I). Consider the linear MMSE estimate $\hat x(t \mid t)$ of the state vector $x(t)$ of the dynamic system (14)-(17) based on the observations $z^t$

$$ z^t := \{z(\tau)\}_{\tau = t_0}^{t} $$

$$ z(t) = H(t)x(t) + \zeta(t) \tag{6.2-18} $$

with $\zeta$ satisfying (6) and (7). Then, provided that the matrix $L(\tau)$ in (10) is nonsingular $\forall \tau = t_0, \cdots, t$, $\hat x(t \mid t)$ is given in recursive form by the estimate update

$$ \hat x(t \mid t) = \hat x(t \mid t-1) + \tilde K(t)e(t) \tag{6.2-19} $$

and the time prediction update

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t) \tag{6.2-20} $$

$$ \hat x(t_0 \mid t_0 - 1) = O_n \tag{6.2-21} $$

where

$$ \tilde K(t) := \Pi(t)H'(t)L^{-1}(t) \tag{6.2-22a} $$

$$ K(t) := \Phi(t)\Pi(t)H'(t)L^{-1}(t) \tag{6.2-22b} $$

is the Kalman gain,

$$ e(t) = z(t) - H(t)\hat x(t \mid t-1) \tag{6.2-23} $$

is the innovation of the observation process $z(t)$,

$$ L(t) = H(t)\Pi(t)H'(t) + \Psi_\zeta(t) \tag{6.2-24} $$

$\Pi(t)$ equals the covariance matrix of the one-step-ahead state prediction error $\tilde x(t \mid t-1) := x(t) - \hat x(t \mid t-1)$

$$ \Pi(t) = \mathcal{E}\{\tilde x(t \mid t-1)\tilde x'(t \mid t-1)\} \tag{6.2-25} $$

and satisfies the following RDE

$$ \Pi(t+1) = \Phi(t)\Pi(t)\Phi'(t) - \Phi(t)\Pi(t)H'(t)L^{-1}(t)H(t)\Pi(t)\Phi'(t) + \Psi_\xi(t) \tag{6.2-26} $$

$$ \phantom{\Pi(t+1)} = \Phi(t)\Pi(t)\Phi'(t) - K(t)L(t)K'(t) + \Psi_\xi(t) \tag{6.2-27} $$

$$ \phantom{\Pi(t+1)} = [\Phi(t) - K(t)H(t)]\,\Pi(t)\,[\Phi(t) - K(t)H(t)]' + K(t)\Psi_\zeta(t)K'(t) + \Psi_\xi(t) \tag{6.2-28} $$

with $\Pi(t_0) = \mathcal{E}\{\tilde x(t_0 \mid t_0 - 1)\tilde x'(t_0 \mid t_0 - 1)\}$ the a priori covariance of $x(t_0)$. Further,
the linear MMSE one-step-ahead prediction of the state is given by

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t-1) + K(t)e(t) \tag{6.2-29} $$

Proof Using (13) in (12) we get

$$ \hat x(t \mid t) = \hat x(t \mid t-1) + M(t \mid t)H'(t)L^{-1}(t)\left[z(t) - H(t)\hat x(t \mid t-1)\right] $$

$$ L(t) = H(t)M(t \mid t)H'(t) + \Psi_\zeta(t) $$

$$ M(t \mid t) := \mathcal{E}\{\tilde x(t \mid t-1)\tilde x'(t \mid t-1)\} $$

where

$$ \tilde x(t \mid \tau) := x(t) - \hat x(t \mid \tau) $$

Then (19) and (22) are proven if we can show that $\Pi(t) := M(t \mid t)$ satisfies (26). Now, by (16),
(20) holds, and $\tilde x(t+1 \mid t) = \Phi(t)\tilde x(t \mid t) + \xi(t)$. Consequently,

$$ M(t+1 \mid t+1) = \Phi(t)M(t \mid t+1)\Phi'(t) + \Psi_\xi(t) \tag{6.2-30} $$

$$ M(t \mid t+1) = \mathcal{E}\{\tilde x(t \mid t)\tilde x'(t \mid t)\} $$

Further, (11) yields

$$ M(t \mid t+1) = M(t \mid t) - M(t \mid t)H'(t)L^{-1}(t)H(t)M(t \mid t) \tag{6.2-31} $$

Finally, setting

$$ \Pi(t) := M(t \mid t) \tag{6.2-32} $$

and combining (30) and (31) we get (26). Eq. (29) is obtained by substituting (19) into (20).
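One cycle of the filter of Theorem 6.2-1 can be sketched as follows for a scalar state and a scalar observation. The function name `kf_step` and the numbers are illustrative assumptions; the code only mirrors (6.2-19)-(6.2-27) and checks numerically that (6.2-27), (6.2-28) and (6.2-29) agree.

```python
def kf_step(x_pred, Pi, z, Phi, H, psi_xi, psi_zeta):
    """One Kalman filter cycle, scalar case, following (6.2-19)-(6.2-27)."""
    L = H * Pi * H + psi_zeta                  # innovation variance (6.2-24)
    e = z - H * x_pred                         # innovation (6.2-23)
    x_filt = x_pred + (Pi * H / L) * e         # estimate update (6.2-19), gain (6.2-22a)
    x_next = Phi * x_filt                      # time prediction update (6.2-20)
    K = Phi * Pi * H / L                       # Kalman gain (6.2-22b)
    Pi_next = Phi * Pi * Phi - K * L * K + psi_xi   # RDE in the form (6.2-27)
    return x_filt, x_next, Pi_next, K, e


# Hypothetical numbers: Phi = 0.9, H = 1, Psi_xi = 0.5, Psi_zeta = 1, Pi(t) = 2.
Phi, H, psi_xi, psi_zeta = 0.9, 1.0, 0.5, 1.0
x_filt, x_next, Pi_next, K, e = kf_step(0.0, 2.0, 1.0, Phi, H, psi_xi, psi_zeta)

# Joseph-type form (6.2-28) and predictor form (6.2-29), for comparison.
joseph = (Phi - K * H) * 2.0 * (Phi - K * H) + K * psi_zeta * K + psi_xi
predictor = Phi * 0.0 + K * e
print(Pi_next, joseph, x_next, predictor)
```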

We intend now to extend the Kalman filter to cover the case of nonzero means. To 
this end, consider Example 1-2 where now 

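$$ \mathcal{E}\{w\} = \bar w \qquad \breve w := w - \bar w $$

$$ \mathcal{E}\{p_k\} = \bar p_k \qquad \breve p_k := p_k - \bar p_k \tag{6.2-33} $$

Here $\breve w$ and $\breve p_k$ are centered random vectors, viz. zero-mean random vectors. Let,
similarly to (1-20),

$$ [p^r] := \mathrm{Span}\{\breve p_k, \bar p_k,\ k = 1, \cdots, r\} = \mathrm{Span}\{p_k, \bar p_k,\ k = 1, \cdots, r\} = \mathrm{Span}\{p_k, I,\ k = 1, \cdots, r\} \tag{6.2-34} $$

where $I$ denotes the random variable equal to one. Then, we refer to

$$ \hat w_{|r} = \arg\min_{v \in [p^r]} \mathcal{E}\{\|w - v\|^2\} \tag{6.2-35} $$

as the affine MMSE estimate of $w$ based on $p^r$ or the linear MMSE estimate of $w$
based on $\{p^r, I\}$.

Problem 6.2-1 Show that $\hat w_{|r}$ in (35) equals

$$ \hat w_{|r} = \bar w + \mathrm{Projec}[\breve w \mid \breve p^r] \tag{6.2-36} $$

i.e. the sum of its a priori mean with the linear MMSE estimate of the centered random vector $\breve w$ based
on the centered observations $\breve p^r$.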



We use the above result so as to find the affine MMSE estimate of the state $x(t)$
of the system

$$ x(t+1) = \Phi(t)x(t) + G_u(t)u(t) + \xi(t) $$

$$ z(t) = H(t)x(t) + \zeta(t) \tag{6.2-37} $$

based on $z^t$, the observations up to time $t$

$$ z^t := \{z(t), z(t-1), \cdots, z(t_0)\} $$

It is assumed that

$$ \mathcal{E}\{x(t_0)\} = \bar x_0 \tag{6.2-38} $$

$$ \mathcal{E}\{u(t)\} = \bar u(t) \in \mathbb{R}^m \tag{6.2-39} $$

Then, setting $\bar x(t) := \mathcal{E}\{x(t)\}$, we have from (37)

$$ \bar x(t+1) = \Phi(t)\bar x(t) + G_u(t)\bar u(t) $$

$$ \bar x(t_0) = \bar x_0 \tag{6.2-40} $$

Further, letting $\breve x(t) := x(t) - \bar x(t)$, $\breve u(t) := u(t) - \bar u(t)$, and $\breve z(t) := z(t) - H(t)\bar x(t)$,
we find

$$ \breve x(t+1) = \Phi(t)\breve x(t) + G_u(t)\breve u(t) + \xi(t) $$

$$ \breve z(t) = H(t)\breve x(t) + \zeta(t) $$

$$ \mathcal{E}\{\breve x(t_0)\} = O_n \qquad \mathcal{E}\{\breve x(t_0)\breve x'(t_0)\} = \Pi(t_0) \tag{6.2-41} $$

Before proceeding any further, let us comment on the decomposition $x(t) = \bar x(t) + \breve x(t)$.
First, note that (40) is a predictable system in that $\bar x(t)$ can be precomputed.
On the contrary, (41) is unpredictable owing to the uncertainty on its initial state
$\breve x(t_0)$, and the presence of the inaccessible disturbance $\xi$.

Let us assume that $\breve u(t)$ equals a vector-valued linear function of $\breve z$ up to
time $t$, viz.

$$ \breve u(t) = f(t, \breve z^t) \tag{6.2-42} $$

The linear MMSE estimate $\hat{\breve x}(t \mid t)$ of $\breve x(t)$ based on $\breve z^t$ equals

$$ \hat{\breve x}(t \mid t) = \hat{\breve x}(t \mid t-1) + \Pi(t)H'(t)L^{-1}(t)\left[\breve z(t) - H(t)\hat{\breve x}(t \mid t-1)\right] $$

$$ \hat{\breve x}(t+1 \mid t) = \Phi(t)\hat{\breve x}(t \mid t) + G_u(t)\breve u(t) \tag{6.2-43} $$

If we add $\bar x(t)$ to the first of (43), and $\bar x(t+1)$ to the second of (43), and set

$$ \hat x(t \mid t) := \hat{\breve x}(t \mid t) + \bar x(t) $$

$$ \hat x(t+1 \mid t) := \hat{\breve x}(t+1 \mid t) + \bar x(t+1) \tag{6.2-44} $$

we find, according to (36), the desired result.

Theorem 6.2-2 (Kalman Filter II). Consider the state vector $x(t)$ of the dynamic
system (37), where all the centered random vectors satisfy (6)-(7) and (14)-(17),
and its linear MMSE estimate $\hat x(t \mid t)$ based on the observations $z^t$. Let
(38) hold and $u(t)$ be linear in $\{z^t, I\}$

$$ u(t) = f(t, z^t, I)\,, \qquad f(t, \cdot, \cdot)\ \text{linear} \tag{6.2-45} $$

Then, under the same assumptions as in Theorem 1, $\hat x(t \mid t)$ satisfies the estimate
update:

$$ \hat x(t \mid t) = \hat x(t \mid t-1) + \tilde K(t)e(t) \tag{6.2-46} $$

with $\tilde K(t) = \Pi(t)H'(t)L^{-1}(t)$ as in (22a), and the time prediction update:

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t) + G_u(t)u(t) \tag{6.2-47} $$

where $\hat x(t_0 \mid t_0 - 1) = \mathcal{E}\{x(t_0)\}$ and (22)-(28) hold true. Furthermore,

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t-1) + G_u(t)u(t) + K(t)e(t) \tag{6.2-48} $$

Proof It suffices to note that, with $\breve z(t) := z(t) - H(t)\bar x(t)$ and $\hat{\breve x}(t \mid t-1)$ the centered observation and the centered-state prediction of (41)-(43),

$$ \breve z(t) - H(t)\hat{\breve x}(t \mid t-1) = z(t) - H(t)\left[\bar x(t) + \hat{\breve x}(t \mid t-1)\right] = z(t) - H(t)\hat x(t \mid t-1). $$

Figure 6.2-1: Illustration of the Kalman filter.



Terminology For the sake of simplicity, from now on we shall refer to the affine
MMSE estimate (35) as the linear MMSE estimate $\hat w_{|r}$ of $w$ based on $p^r$, by adhering
to the convention to include in $p^r$ the random variable $p_0 := I$.

Eq. (46) and (47), or (48), along with (22)-(28), are called the Kalman Filter
(KF). Fig. 1 shows a diagrammatic representation of the KF. The KF outputs
indicated in Fig. 1 are the innovations process $e(t)$, the state filtered estimate $\hat x(t \mid t)$
and the one-step-ahead prediction estimate $\hat x(t+1 \mid t)$. The first, which in view
of (23) is a by-product of $\hat x(t \mid t-1)$, is indicated to point out that the KF can
also be considered as an innovations generator.
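The passage from the centered filter to the filter of Theorem 6.2-2 can be checked numerically. The sketch below, for a scalar system with a deterministic input (so that the centered input is zero), runs the filter (46)-(47) directly and, alternatively, the mean recursion (40) plus the centered filter (43) followed by (44). All names and numbers are illustrative assumptions.

```python
def kf2_step(x_pred, Pi, z, u, Phi, Gu, H, psi_xi, psi_zeta):
    """Scalar version of (6.2-46)-(6.2-47) with the RDE in the form (6.2-27)."""
    L = H * Pi * H + psi_zeta
    e = z - H * x_pred
    x_filt = x_pred + (Pi * H / L) * e          # estimate update (6.2-46)
    x_next = Phi * x_filt + Gu * u              # time prediction update (6.2-47)
    K = Phi * Pi * H / L
    Pi_next = Phi * Pi * Phi - K * L * K + psi_xi
    return x_filt, x_next, Pi_next


Phi, Gu, H, psi_xi, psi_zeta = 0.8, 1.0, 1.0, 0.2, 0.5
x0_mean, Pi0 = 3.0, 1.0
u_seq = [1.0, -0.5, 0.25, 0.0]      # deterministic input: u(t) equals its mean
z_seq = [3.4, 3.1, 2.2, 2.0]        # hypothetical observations

# Route 1: filter of Theorem 6.2-2 applied to the raw data.
x_pred, Pi = x0_mean, Pi0
direct = []
for z, u in zip(z_seq, u_seq):
    x_filt, x_pred, Pi = kf2_step(x_pred, Pi, z, u, Phi, Gu, H, psi_xi, psi_zeta)
    direct.append(x_filt)

# Route 2: mean system (6.2-40) plus zero-mean filter on centered data, then (6.2-44).
x_bar, xc_pred, Pi = x0_mean, 0.0, Pi0
via_centering = []
for z, u in zip(z_seq, u_seq):
    z_centered = z - H * x_bar
    xc_filt, xc_pred, Pi = kf2_step(xc_pred, Pi, z_centered, 0.0,
                                    Phi, Gu, H, psi_xi, psi_zeta)
    via_centering.append(xc_filt + x_bar)
    x_bar = Phi * x_bar + Gu * u

print(direct)
print(via_centering)
```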

Problem 6.2-2 Show that for every integer $k \in \mathbb{Z}_+$ the linear MMSE $k$-step-ahead prediction
of $x(t+k)$ based on $\{z^t, I\}$ is given by

$$ \hat x(t+k \mid t) = \Phi(t+k, t)\hat x(t \mid t) + \varphi\left(t+k, t, O_X, u_{[t, t+k)}\right) \tag{6.2-49} $$

and

$$ \hat z(t+k \mid t) = H(t+k)\hat x(t+k \mid t) \tag{6.2-50} $$

where $\hat x(t \mid t)$ is as in Theorem 2 and the same notations as in Problem 2.2-1 are used.



Problem 6.2-3 (Duality between KF and LQR) Given the dynamic linear system $\Sigma = (\Phi(t), G(t), H(t))$,
define $\Sigma^* = (\Phi^*(t) := \Phi'(t),\ G^*(t) := H'(t),\ H^*(t) := G'(t))$ as the dual system of $\Sigma$. Consider
next the LQOR problem of Chapter 2 for $M(t) = O$ and its solution as given by Theorem 2.3-1.
Consider $\Psi_\xi(t)$ in (15). Factorize $\Psi_\xi(t)$ as $G(t)G'(t)$ with $G(t)$ a full column-rank matrix. Show
that, if $\Psi_\zeta(t) = \psi_u(t)$, the RDE (26) for the KF and the system $\Sigma$ is the same as the one of
(2.3-3) for LQOR with $\psi_y(t) = I$ and the plant $\Sigma^*$, the only difference being that while the first
is updated forward in time, the latter is a backward difference equation. Show that under the
above duality conditions the state-transition matrix $\Phi^*(t) + G^*(t)F(t)$ of the closed-loop LQOR
system is the transpose of the state-transition matrix $\Phi(t) - K(t)H(t)$ of the KF provided that
the two RDEs (2.3-3) and (26) are iterated, backward and forward respectively, by
the same number of steps starting from a common initial nonnegative definite matrix $\Pi_0 = \mathcal{P}(T)$.



6.2.2 Steady-State Kalman Filtering 

Consider the time-invariant KF problem, viz.: $\Phi(t) = \Phi$; $\Psi_\xi(t) = \Psi_\xi = GG'$ with
$G$ of full column-rank; $H(t) = H$; and $\Psi_\zeta(t) = \Psi_\zeta$. We can use duality between
KF and LQR to adapt Theorem 2.4-5 to the present context.

Theorem 6.2-3. (Steady-State KF) Consider the time-invariant KF problem
and the related matrix sequence $\{\Pi(t)\}_{t=0}^{\infty}$ generated via the Riccati iterations (26)-(28)
initialized from any $\Pi(0) = \Pi'(0) \ge 0$. Assume that $\Psi_\zeta = \Psi_\zeta' > 0$. Then there
exists

$$ \Pi = \lim_{t \to \infty} \Pi(t) \tag{6.2-51} $$

such that

$$ \Pi = \lim_{t \to \infty} \mathcal{E}\{\tilde x(t \mid t-1)\tilde x'(t \mid t-1)\} = \lim_{t \to \infty}\ \min_{v \in [z^{t-1}, I]} \mathcal{E}\{[x(t) - v][x(t) - v]'\} \tag{6.2-52} $$

Further the limiting filter

$$ \hat x(t \mid t) = \hat x(t \mid t-1) + \Pi H'L^{-1}e(t) \tag{6.2-53} $$

$$ \hat x(t+1 \mid t) = \Phi\hat x(t \mid t) + G_u(t)u(t) \tag{6.2-54} $$

with

$$ L = H\Pi H' + \Psi_\zeta \tag{6.2-55} $$

or

$$ \hat x(t+1 \mid t) = \Phi\hat x(t \mid t-1) + G_u(t)u(t) + Ke(t) \tag{6.2-56} $$

$$ e(t) = z(t) - H\hat x(t \mid t-1) \tag{6.2-57} $$

with $K$ the steady-state Kalman gain

$$ K = \Phi\Pi H'L^{-1} \tag{6.2-58} $$

is asymptotically stable, i.e. $\Phi - KH$ is a stability matrix, if and only if the system
$(\Phi, G, H)$ generating the observations $z(t)$ is stabilizable and detectable. Further,
under such conditions the matrix $\Pi$ in (51) coincides with the unique symmetric
nonnegative definite solution of the following algebraic Riccati equation (ARE)

$$ \Pi = \Phi\Pi\Phi' - \Phi\Pi H'\left(H\Pi H' + \Psi_\zeta\right)^{-1}H\Pi\Phi' + \Psi_\xi \tag{6.2-59} $$

Terminology We shall refer to the conditions of Theorem 3 along with the properties
of stabilizability and detectability of $(\Phi, G, H)$ as the standard case of steady-state KF.
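The convergence stated in Theorem 6.2-3 is easy to observe numerically. The sketch below iterates the scalar form of the RDE (26) for an unstable but stabilizable and detectable plant, then checks the ARE (59) and the stability of $\Phi - KH$. The helper name `riccati_step`, the iteration count and the numbers are illustrative assumptions.

```python
def riccati_step(Pi, Phi, H, psi_xi, psi_zeta):
    """One iteration of the scalar RDE (6.2-26)."""
    L = H * Pi * H + psi_zeta
    return Phi * Pi * Phi - (Phi * Pi * H) ** 2 / L + psi_xi


# Hypothetical scalar plant: unstable (|Phi| > 1), with G = 1 and H = 1,
# hence stabilizable and detectable.
Phi, H, psi_xi, psi_zeta = 1.2, 1.0, 1.0, 1.0

Pi = 0.0                      # any nonnegative initialization Pi(0)
for _ in range(200):
    Pi = riccati_step(Pi, Phi, H, psi_xi, psi_zeta)

L = H * Pi * H + psi_zeta     # (6.2-55)
K = Phi * Pi * H / L          # steady-state Kalman gain (6.2-58)
are_residual = Pi - riccati_step(Pi, Phi, H, psi_xi, psi_zeta)   # zero if (6.2-59) holds
closed_loop = Phi - K * H

print(Pi, K, closed_loop, are_residual)
```

For these numbers the ARE reduces to $\Pi^2 - 1.44\,\Pi - 1 = 0$, whose nonnegative root is the limit reached by the iterations.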
6.2.3 Correlated Disturbances

We consider again the system

$$ x(t+1) = \Phi(t)x(t) + G_u(t)u(t) + \xi(t) $$

$$ z(t) = H(t)x(t) + \zeta(t) \tag{6.2-60} $$

along with the usual assumptions. Instead of (16), it is assumed hereafter that

$$ \mathcal{E}\{\xi(t)\zeta'(\tau)\} = S(t)\,\delta_{t,\tau} \tag{6.2-61} $$

with $S(t) \in \mathbb{R}^{n \times p}$ possibly a nonzero matrix. In order to extend Theorem 1 to the
present case, it is convenient to introduce the linear MMSE estimate $\hat\xi(t)$ of $\xi(t)$
based on $\zeta^t$

$$ \hat\xi(t) := \mathrm{Projec}[\xi(t) \mid \zeta^t] = \mathrm{Projec}[\xi(t) \mid \zeta(t)] = S(t)\Psi_\zeta^{-1}(t)\zeta(t) \tag{6.2-62a} $$

where the second equality follows from (61) and the third from (1-24). Set

$$ \tilde\xi(t) := \xi(t) - \hat\xi(t) \tag{6.2-62b} $$

Problem 6.2-4 Show that $\tilde\xi(t)$ is uncorrelated with $\zeta(\tau)$

$$ \mathcal{E}\{\tilde\xi(t)\zeta'(\tau)\} = O_{n \times p}\,, \quad \forall t, \tau \ge t_0 \tag{6.2-62c} $$

and white with covariance matrix

$$ \Psi_{\tilde\xi}(t) = \Psi_\xi(t) - S(t)\Psi_\zeta^{-1}(t)S'(t) \tag{6.2-62d} $$

Using (62), the first of (60) can be rewritten as follows

$$ x(t+1) = \Phi(t)x(t) + G_u(t)u(t) + \hat\xi(t) + \tilde\xi(t) \tag{6.2-63} $$

$$ \phantom{x(t+1)} = \Phi(t)x(t) + G_u(t)u(t) + S(t)\Psi_\zeta^{-1}(t)\left[z(t) - H(t)x(t)\right] + \tilde\xi(t) $$

$$ \phantom{x(t+1)} = \left[\Phi(t) - S(t)\Psi_\zeta^{-1}(t)H(t)\right]x(t) + G_u(t)u(t) + S(t)\Psi_\zeta^{-1}(t)z(t) + \tilde\xi(t) $$

This system is now in the standard form (37) since $\tilde\xi(t)$ and $\zeta(t)$ are both zero mean,
white and mutually uncorrelated, and the extra input $S(t)\Psi_\zeta^{-1}(t)z(t) \in \mathrm{Span}\{z^t\}$.
Thus, Theorem 2 can be used to get an estimate update identical with
(46) and the following time prediction update

$$ \hat x(t+1 \mid t) = \left[\Phi(t) - S(t)\Psi_\zeta^{-1}(t)H(t)\right]\hat x(t \mid t) + G_u(t)u(t) + S(t)\Psi_\zeta^{-1}(t)z(t) \tag{6.2-64a} $$

Problem 6.2-5 Prove that the state prediction error covariance, defined as in (25), in
the present case satisfies the recursion

$$ \Pi(t+1) = \Phi(t)\Pi(t)\Phi'(t) - \left[\Phi(t)\Pi(t)H'(t) + S(t)\right]L^{-1}(t)\left[\Phi(t)\Pi(t)H'(t) + S(t)\right]' + \Psi_\xi(t) \tag{6.2-64b} $$

Problem 6.2-6 Prove that the recursive equation for $\hat x(t \mid t-1)$ can be rearranged as follows

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t-1) + G_u(t)u(t) + K(t)\left[z(t) - H(t)\hat x(t \mid t-1)\right] \tag{6.2-64c} $$

where $K(t)$ is the Kalman gain

$$ K(t) = \left[\Phi(t)\Pi(t)H'(t) + S(t)\right]L^{-1}(t) \tag{6.2-64d} $$
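The claims of Problems 6.2-5 and 6.2-6 can be checked numerically in the scalar case. The sketch below compares the predictor form (64b)-(64d) with the standard filter applied to the decorrelated system (63), i.e. with transition $\Phi - S\Psi_\zeta^{-1}H$ and disturbance covariance (62d). Function names and numbers are illustrative assumptions.

```python
def step_direct(x_pred, Pi, z, u, Phi, Gu, H, S, psi_xi, psi_zeta):
    """Predictor (6.2-64c) with gain (6.2-64d) and covariance recursion (6.2-64b)."""
    L = H * Pi * H + psi_zeta
    K = (Phi * Pi * H + S) / L
    x_next = Phi * x_pred + Gu * u + K * (z - H * x_pred)
    Pi_next = Phi * Pi * Phi - K * L * K + psi_xi
    return x_next, Pi_next


def step_decorrelated(x_pred, Pi, z, u, Phi, Gu, H, S, psi_xi, psi_zeta):
    """Standard filter applied to the rewritten system (6.2-63)."""
    Phi_mod = Phi - S / psi_zeta * H            # modified transition
    psi_res = psi_xi - S * S / psi_zeta         # covariance (6.2-62d)
    L = H * Pi * H + psi_zeta
    x_filt = x_pred + (Pi * H / L) * (z - H * x_pred)          # update (6.2-46)
    x_next = Phi_mod * x_filt + Gu * u + S / psi_zeta * z      # prediction (6.2-64a)
    K_mod = Phi_mod * Pi * H / L
    Pi_next = Phi_mod * Pi * Phi_mod - K_mod * L * K_mod + psi_res
    return x_next, Pi_next


params = dict(Phi=0.9, Gu=1.0, H=1.0, S=0.5, psi_xi=1.0, psi_zeta=1.0)
z_seq = [0.7, -0.2, 1.3, 0.4]       # hypothetical observations
u_seq = [0.0, 1.0, -1.0, 0.5]       # hypothetical inputs

xa, Pa = 0.0, 2.0
xb, Pb = 0.0, 2.0
gaps = []
for z, u in zip(z_seq, u_seq):
    xa, Pa = step_direct(xa, Pa, z, u, **params)
    xb, Pb = step_decorrelated(xb, Pb, z, u, **params)
    gaps.append(max(abs(xa - xb), abs(Pa - Pb)))

print(gaps)
```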



Problem 6.2-7 Consider the time-invariant linear system

$$ x(t+1) = \Phi x(t) + G_u u(t) + G\zeta(t) $$

$$ z(t) = Hx(t) + \zeta(t) $$

where the possibly non-Gaussian process $\zeta$ satisfies the following martingale difference properties
(Cf. Appendix D.5 and Sect. 7.2 for the definition of $\{\mathcal{F}_k\}$)

$$ \mathcal{E}\{\zeta(t) \mid \mathcal{F}_{t-1}\} = O_p \quad \text{a.s.} $$

$$ \infty > \mathcal{E}\{\zeta(t)\zeta'(t) \mid \mathcal{F}_{t-1}\} = \Psi_\zeta > 0 \quad \text{a.s.} $$

and

$$ \mathcal{E}\{\zeta(t_0)x'(t_0)\} = O_{p \times n} $$

Let $\Phi - GH$ be a stability matrix and $u(t)$ satisfy (45). Let $\hat x(t \mid k) := \mathcal{E}\{x(t) \mid z^k\}$. Then, show
that $\lim_{t \to \infty} \Pi(t) = 0$, that $\hat x(t \mid t) \to \hat x(t \mid t-1)$ as $t \to \infty$, and that $\hat x(t \mid t-1)$ satisfies the recursions

$$ \hat x(t+1 \mid t) = \Phi\hat x(t \mid t-1) + G_u u(t) + G\zeta(t) $$

$$ \zeta(t) = z(t) - H\hat x(t \mid t-1) $$

6.2.4 Distributional Interpretation of the Kalman Filter 

It is known [Cai88] that if $u$ and $v$ are two jointly Gaussian random vectors with
zero mean, the orthogonal projection $\mathrm{Projec}[u \mid [v]]$ coincides with the conditional
expectation $\mathcal{E}\{u \mid v\}$ which, in turn, yields the unconstrained, viz. linear or nonlinear,
MMSE estimate of $u$ based on $v$. Then, it follows that if $x(t_0)$, $\xi(\tau)$ and
$\zeta(t)$ are jointly Gaussian, $\hat x(t \mid t-1)$ and $\Pi(t)$ generated by the KF coincide with
the conditional expectation $\mathcal{E}\{x(t) \mid z^{t-1}\}$ and, respectively, the conditional covariance
$\mathrm{Cov}(x(t) \mid z^{t-1})$. Further, since under the stated assumptions the conditional
probability distribution $P(x(t) \mid z^{t-1})$ of $x(t)$ given $z^{t-1}$ is Gaussian, we have

$$ P(x(t) \mid z^{t-1}) = N(\hat x(t \mid t-1), \Pi(t)) \tag{6.2-65} $$

where $N(\bar x, \Pi)$ denotes the Gaussian or Normal probability distribution with mean
$\bar x$ and covariance $\Pi$, viz., assuming $\Pi$ nonsingular and hence considering the probability
density function $n(\bar x, \Pi)$ corresponding to $N(\bar x, \Pi)$,

$$ n(\bar x, \Pi) = \left[(2\pi)^n \det \Pi\right]^{-1/2} \exp\left(-\tfrac{1}{2}\|x - \bar x\|^2_{\Pi^{-1}}\right) $$

The operation of the KF under such hypotheses is referred to as the distributional
version, or interpretation, of the Kalman filter.

An interesting and useful extension of the distributional KF is the conditionally
Gaussian Kalman filter wherein all the deterministic quantities at the time $t$ in the
filter derivation are known once a realization of $z^t$ is given. E.g., (46)-(48) are still
valid in case $u(t) = f(t, z^t)$ with $f(t, \cdot)$ possibly nonlinear. This is of interest in
problems where the input $u$ is generated by a causal feedback.



Figure 6.2-2: Illustration of the KF as an innovations generator. The third system
recovers z from its innovations e.



Fact 6.2-1 (Kalman Filter III). Suppose that in the dynamic system (37) $x(t_0)$,
$\xi$ and $\zeta$ are jointly Gaussian. Let all the centered random vectors satisfy (6)-(7)
and (14)-(17). Assume that

$$ u(t) = f(t, z^t) $$

with $f(t, \cdot)$ possibly nonlinear. Then, the conditional expectation $\mathcal{E}\{x(t) \mid z^{t-1}\}$ of
the state $x(t)$ given $z^{t-1}$ coincides with the vector $\hat x(t \mid t-1)$ generated by the KF
equations of Theorem 2 or their extension (64).

Recall (Appendix D) that the conditional mean $\mathcal{E}\{x(t) \mid z^{t-1}\}$ is the MMSE
estimator of $x(t)$ based on $z^{t-1}$ amongst all possible linear and nonlinear estimators
of $x(t)$.



6.2.5 Innovations Representation 

As shown in Fig. 1, one of the KF outputs is the innovations process e. In this 
respect, the KF plays the role of a whitening filter, its input process z being trans- 
formed into the white innovations process e: 

$$ \hat x(t+1 \mid t) = \left[\Phi(t) - K(t)H(t)\right]\hat x(t \mid t-1) + G_u(t)u(t) + K(t)z(t) $$

$$ e(t) = -H(t)\hat x(t \mid t-1) + z(t) \tag{6.2-66a} $$

On the other hand, as was noticed in [Kai68], the observation process $z$ can be
recovered via the "inverse" of (66a)

$$ \hat x(t+1 \mid t) = \Phi(t)\hat x(t \mid t-1) + G_u(t)u(t) + K(t)e(t) $$

$$ z(t) = H(t)\hat x(t \mid t-1) + e(t) \tag{6.2-66b} $$

The situation is depicted in Fig. 2 where it is also shown that $z$ is generated
by the dynamic system (37), labelled "physical system". Eq. (66b) is called the
innovations representation of the process $z$.
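That (66b) inverts (66a) can be seen by running the two systems in cascade. The sketch below does so for a scalar time-invariant case; `whiten` and `unwhiten` are hypothetical helper names, and the gain value is arbitrary since the inversion holds for any $K$, not only for the Kalman gain.

```python
def whiten(z_seq, u_seq, Phi, Gu, H, K, x0=0.0):
    """Scalar form of (6.2-66a): observations in, innovations out."""
    x, e_seq = x0, []
    for z, u in zip(z_seq, u_seq):
        e_seq.append(z - H * x)                    # second of (6.2-66a)
        x = (Phi - K * H) * x + Gu * u + K * z     # first of (6.2-66a)
    return e_seq


def unwhiten(e_seq, u_seq, Phi, Gu, H, K, x0=0.0):
    """Scalar form of (6.2-66b): innovations in, observations out."""
    x, z_seq = x0, []
    for e, u in zip(e_seq, u_seq):
        z_seq.append(H * x + e)                    # second of (6.2-66b)
        x = Phi * x + Gu * u + K * e               # first of (6.2-66b)
    return z_seq


# Hypothetical data and parameters.
Phi, Gu, H, K = 0.9, 0.5, 2.0, 0.3
z_seq = [1.0, 0.4, -0.7, 2.1, 0.0, -1.5]
u_seq = [0.0, 1.0, 1.0, -1.0, 0.5, 0.0]

e_seq = whiten(z_seq, u_seq, Phi, Gu, H, K)
z_back = unwhiten(e_seq, u_seq, Phi, Gu, H, K)
print(max(abs(a - b) for a, b in zip(z_seq, z_back)))
```

Both systems carry the same state $\hat x(t \mid t-1)$, which is why the reconstruction is exact up to rounding when they start from the same initial prediction.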

In the time-invariant case, under the validity conditions of Theorem 3 and 
in steady-state, we can compute the transfer matrices associated with (66a) and 
(66b). We find 

$$ \left[\, H_{eu}(d) \ \ H_{ez}(d) \,\right] = -H\left(I_n - d(\Phi - KH)\right)^{-1}\left[\, dG_u \ \ dK \,\right] + \left[\, O_{p \times m} \ \ I_p \,\right] \tag{6.2-67a} $$

$$ \phantom{\left[\, H_{eu}(d) \ \ H_{ez}(d) \,\right]} = C^{-1}(d)\left[\, -B(d) \ \ A(d) \,\right] \tag{6.2-67b} $$

$$ \left[\, H_{zu}(d) \ \ H_{ze}(d) \,\right] = H\left(I_n - d\Phi\right)^{-1}\left[\, dG_u \ \ dK \,\right] + \left[\, O_{p \times m} \ \ I_p \,\right] \tag{6.2-67c} $$

$$ \phantom{\left[\, H_{zu}(d) \ \ H_{ze}(d) \,\right]} = A^{-1}(d)\left[\, B(d) \ \ C(d) \,\right] \tag{6.2-67d} $$

Problem 6.2-8 By using the Matrix Inversion Lemma (5.3-21), verify that in (67a) and (67c)
$H_{ze}(d) = H_{ez}^{-1}(d)$. Further, check that $H_{eu}(d) = -H_{ez}(d)H_{zu}(d)$, and $H_{zu}(d) = -H_{ze}(d)H_{eu}(d)$.

In (67b) $C^{-1}(d)\left[\, -B(d) \ \ A(d) \,\right]$ denotes a left coprime MFD of the transfer matrix
in (67a). Then, it follows from Fact 3.1-1 that

$$ \det C(d) \ \mid\ \det C(0) \cdot \chi_{\Phi_K}(d) \tag{6.2-68a} $$

$$ \Phi_K := \Phi - KH \tag{6.2-68b} $$

In the standard case of steady-state KF, the state-transition matrix $\Phi_K$ is asymptotically
stable. Hence, the $p \times p$ polynomial matrix $C(d)$ is strictly Hurwitz.
The discussion is summarized in the following theorem.

Theorem 6.2-4. The KF (66a) causally transforms the observation process $z$ into
its innovations process $e$. This transformation admits a causal inverse (66b), called
innovations representation of $z$. In the standard case of steady-state KF, (66a)
becomes time-invariant and asymptotically stable, yielding a stationary innovations
process with covariance $H\Pi H' + \Psi_\zeta$.

Finally, the $p \times p$ polynomial matrix $C(d)$ in the I/O innovations representation
obtained from (67d)

$$ A(d)z(t) = B(d)u(t) + C(d)e(t) \tag{6.2-69a} $$

satisfies (68) and hence, in the standard case, $C(d)$ is strictly Hurwitz.

In the statistics and engineering literature the innovations representation (69)
is called an ARMAX (Autoregressive Moving-Average with eXogenous inputs) or a
CARMA (Controlled ARMA) model. ARMAX models have become widely known
and exploited in time-series analysis, econometrics and engineering [BJ76], [Ast70].

The word exogenous has been adopted in time-series analysis and econometrics
to describe any influence that originates outside the system. In control theory,
however, the process $u$ appearing in an ARMAX system is a control input that
may be a function of past values of $z$ and $u$. For this reason in such cases, an
ARMAX system is more appropriately referred to as a CARMA model.

Whenever $C(d) = I_p$ in (69a), the resulting representation

$$ A(d)z(t) = B(d)u(t) + e(t) \tag{6.2-69b} $$

is called an ARX (Autoregressive with eXogenous inputs) or a CAR (Controlled
AR) model.
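For a scalar time-invariant system the passage from the state-space innovations representation (66b) to the ARMAX form (69a) can be made explicit: $A(d) = 1 - \Phi d$, $B(d) = HG_u d$ and $C(d) = 1 - (\Phi - KH)d$. The sketch below, with hypothetical names and numbers and a zero initial prediction, drives both descriptions with the same sequences and compares the outputs.

```python
def simulate_state_space(e_seq, u_seq, Phi, Gu, H, K):
    """Innovations representation (6.2-66b), scalar case, zero initial prediction."""
    x, z_seq = 0.0, []
    for e, u in zip(e_seq, u_seq):
        z_seq.append(H * x + e)
        x = Phi * x + Gu * u + K * e
    return z_seq


def simulate_armax(e_seq, u_seq, a1, b1, c1):
    """z(t) + a1 z(t-1) = b1 u(t-1) + e(t) + c1 e(t-1), with zero pre-history."""
    z_seq = []
    z_prev = u_prev = e_prev = 0.0
    for e, u in zip(e_seq, u_seq):
        z = -a1 * z_prev + b1 * u_prev + e + c1 * e_prev
        z_seq.append(z)
        z_prev, u_prev, e_prev = z, u, e
    return z_seq


Phi, Gu, H, K = 0.9, 0.5, 2.0, 0.3
a1 = -Phi                  # A(d) = 1 - Phi d
b1 = H * Gu                # B(d) = H Gu d
c1 = -(Phi - K * H)        # C(d) = 1 - (Phi - K H) d

e_seq = [0.3, -1.0, 0.8, 0.1, -0.4, 0.6]    # stands in for a white innovations sample
u_seq = [1.0, 0.0, -1.0, 1.0, 0.5, 0.0]

z_ss = simulate_state_space(e_seq, u_seq, Phi, Gu, H, K)
z_io = simulate_armax(e_seq, u_seq, a1, b1, c1)
print(max(abs(a - b) for a, b in zip(z_ss, z_io)))
```

Setting `c1 = 0`, i.e. $\Phi = KH$ in this scalar case, turns the second simulator into the ARX model (69b).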

6.2.6 Solution via Polynomial Equations 

The polynomial equation approach of Chapter 4 can be adapted mutatis mutandis
to solve the steady-state KF problem. We give here the relevant polynomial
equations without a detailed derivation. The reason is that the results that follow
consist of the direct dual equations of the ones obtained in Chapter 4. The
interested reader is referred to [CM92b] for a thorough discussion of the topic.

We consider the following time-invariant version of (37)

$$ x(t+1) = \Phi x(t) + G_u(t)u(t) + G\nu(t) $$

$$ z(t) = Hx(t) + \zeta(t) \tag{6.2-70a} $$

where $\nu$ and $\zeta$ are zero mean, mutually uncorrelated processes with constant
covariance matrices

$$ \mathcal{E}\{\nu(t)\nu'(\tau)\} = \Psi_\nu\,\delta_{t,\tau} \qquad \mathcal{E}\{\zeta(t)\zeta'(\tau)\} = \Psi_\zeta\,\delta_{t,\tau} \tag{6.2-70b} $$

where $\Psi_\nu = \Psi_\nu' > 0$ and $\Psi_\zeta = \Psi_\zeta' > 0$. Define the following polynomial matrices

$$ A(d) := I_n - d\Phi \qquad B(d) := dH \tag{6.2-70c} $$

Find a left coprime MFD $A_1^{-1}(d)B_1(d)$ of $B(d)A^{-1}(d)$

$$ A_1^{-1}(d)B_1(d) = B(d)A^{-1}(d) \tag{6.2-70d} $$

Note that $B(d)A^{-1}(d)$ is a MFD of the transfer matrix from $\xi(t) := G\nu(t)$ to
$z(t)$. Next, find a $p \times p$ Hurwitz polynomial matrix $C(d)$ solving the following left
spectral factorization problem

$$ C(d)C^*(d) = A_1(d)\Psi_\zeta A_1^*(d) + B_1(d)\Psi_\xi B_1^*(d) \tag{6.2-71a} $$

with $\Psi_\xi := G\Psi_\nu G'$. Let

$$ q := \max\{\partial A_1(d), \partial B_1(d)\} \tag{6.2-71b} $$

$\partial A(d)$ denoting the degree of the polynomial matrix $A(d)$. Define

$$ \bar C(d) := d^q C^*(d)\ ; \qquad \bar A_1(d) := d^q A_1^*(d)\ ; \qquad \bar B_1(d) := d^q B_1^*(d) \tag{6.2-71c} $$

Let the greatest common right divisors of $A(d)$ and $B(d)$ be strictly Hurwitz, i.e.
(Cf. B-5) the pair $(\Phi, H)$ detectable. Then, from the dual of Lemma 4.4-1, it follows
that there is a unique solution $(X, Y, Z(d))$ of the following system of bilateral
Diophantine equations

$$ Y\bar C(d) + A(d)Z(d) = \Psi_\xi \bar B_1(d) \tag{6.2-72a} $$

$$ X\bar C(d) - B(d)Z(d) = \Psi_\zeta \bar A_1(d) \tag{6.2-72b} $$

with $\partial Z(d) < \partial \bar C(d)$.

Problem 6.2-9 Show that a triplet $(X, Y, Z(d))$ is a solution of (72a) and (72b) if and only if
it solves (72a) and

$$ A_1(d)X + B_1(d)Y = C(d) \tag{6.2-72c} $$

The dual of Theorem 4.4-1 gives the polynomial solution of the steady-state KF
problem.

Theorem 6.2-5. Let $(\Phi, H)$ be a detectable pair, or equivalently, $A(d) := I - d\Phi$
and $B(d) := dH$ have strictly Hurwitz gcrd's. Let $(X, Y, Z(d))$ be the minimum
degree solution w.r.t. $Z(d)$ of the bilateral Diophantine equations (72a) and (72b)
[or (72a) and (72c)]. Then, the constant matrix

$$ K = YX^{-1} \tag{6.2-73} $$

makes $\Phi_K = \Phi - KH$ a stability matrix if and only if the spectral factor $C(d)$
in (71a) is strictly Hurwitz. In such a case, (73) yields the Kalman gain of the
steady-state KF. Further, if $(\Phi, H)$ is reconstructible, the d-characteristic polynomial
$\chi_{\Phi_K}(d)$ equals

$$ \chi_{\Phi_K}(d) = \frac{\det C(d)}{\det C(0)} \tag{6.2-74} $$

If $\Psi_\zeta$ is positive definite, $C(d)$ is strictly Hurwitz if $(\Phi, G, H)$ is stabilizable and
detectable.

Finally, if $(\Phi, H)$ is an observable pair, the matrix pair $(X, Y)$ in (73) is the
constant solution of the unilateral Diophantine equation (72c).

The results dual to the ones of Sect. 4.5 which give the relationship between
the polynomial and the Riccati-based solution of Theorem 3 are listed below.

$$ \Pi = \Phi\Pi\Phi' - YY' + \Psi_\xi \tag{6.2-75a} $$

$$ YY' = \Phi\Pi H'\left(\Psi_\zeta + H\Pi H'\right)^{-1}H\Pi\Phi' \tag{6.2-75b} $$

$$ XX' = \Psi_\zeta + H\Pi H' \tag{6.2-75c} $$

$$ YX' = \Phi\Pi H' \tag{6.2-75d} $$

$$ Z(d) = \Pi\bar B_1(d) \tag{6.2-75e} $$
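The relations (75) can be verified numerically in the scalar case, where $A_1 = A$, $B_1 = B$ and, for real coefficients, $C^*(d)$ is taken to be $C(d^{-1})$ (an assumption about the star notation made for this sketch). The code below computes $\Pi$ by iterating the RDE, builds $X$, $Y$ and $C(d)$ from (75c), (75d) and (72c), and checks (73), (75a) and the spectral factorization (71a) at a test point. Names and numbers are illustrative.

```python
import math

# Hypothetical scalar data: Phi, H, and covariances Psi_xi, Psi_zeta.
Phi, H, psi_xi, psi_zeta = 0.9, 1.0, 1.0, 0.5

# Steady-state Pi from the Riccati iterations (6.2-26).
Pi = 0.0
for _ in range(500):
    L = psi_zeta + H * Pi * H
    Pi = Phi * Pi * Phi - (Phi * Pi * H) ** 2 / L + psi_xi

L = psi_zeta + H * Pi * H
X = math.sqrt(L)               # scalar square root solving (6.2-75c)
Y = Phi * Pi * H / X           # from (6.2-75d)
K = Y / X                      # Kalman gain via (6.2-73)


def A(d):
    return 1.0 - d * Phi       # (6.2-70c)


def B(d):
    return d * H               # (6.2-70c)


def C(d):
    return A(d) * X + B(d) * Y   # unilateral equation (6.2-72c)


d = 0.5                        # arbitrary nonzero test point
lhs = C(d) * C(1.0 / d)
rhs = A(d) * psi_zeta * A(1.0 / d) + B(d) * psi_xi * B(1.0 / d)
are_gap = Pi - (Phi * Pi * Phi - Y * Y + psi_xi)   # relation (6.2-75a)

print(K, lhs, rhs, are_gap)
```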



Main points of the section The Kalman filter of Theorem 2 or its extension 
(64) for correlated disturbances gives the recursive linear MMSE estimate of the 
state of a stochastic linear dynamic system based on noisy output observations. 
Under Gaussian regime, the Kalman filter yields the conditional distribution of the 
state given the observations. 

State-space and I/O innovations representations, such as ARMAX and ARX 
processes, result from Kalman filtering theory. 



6.3 System Parameter Estimation 

The theory of optimal filtering and control assumes the availability of a mathe- 
matical model capable of adequately describing the behaviour of the system under 
consideration. Such models can be obtained from the physical laws governing the 
system or by some form of data analysis. The latter approach, referred to as system
identification, is appropriate when the system is highly complex or imprecisely
understood, but the behaviour of the relevant I/O variables can be adequately de- 
scribed by simple models. System identification should not be necessarily seen as 
an alternative to physical modelling in that it can be used to refine an incomplete 
model derived via the latter approach. 

The system identification methodology involves a number of steps like: 

i. Selection of a model set from which a model that adequately fits the experi- 
mental data has to be chosen; 

ii. Experiment design whereby the inputs to the unknown system are chosen and 
the measurements to be taken planned; 

iii. Model selection from the experimental data;

iv. Model validation where the selected model is accepted or rejected on the
grounds of its adequacy to some specific task such as prediction or control
system design.

For these various aspects of system identification, we refer the reader to related 
specific standard textbooks, e.g. [Lju87] and [SS89]. 

In this section we limit our considerations to models consisting of linear time- 
invariant dynamic systems parameterized by a vector with real components. We 
focus the attention on how to suitably choose one model fitting the experimental 
data. This aspect of the system identification methodology is usually referred to as 
parameter estimation. This terminology is somewhat misleading in that it suggests 
the existence of a "true" parameter by which the system can be exactly represented 
in the model set. In fact, since in practice this is never achieved, the aim of system 
identification merely consists in the selection of a model whose response is capable 
of adequately approximating that of the unknown underlying system. 

Also with parameter estimation algorithms our choice has been quite selective. 
In fact, the main emphasis is on algorithms which admit a prediction error formu- 
lation. In particular, no description is given here of instrumental variables methods 
for which we refer the reader to the standard textbooks. 

We consider hereafter various recursive algorithms for estimating the parameters 
of a time-invariant linear dynamic system model from I/O data. 

6.3.1 Linear Regression Algorithms 

We start by assuming that the system with inputs $u(k) \in \mathbb{R}$ and outputs $y(k) \in \mathbb{R}$
is exactly represented by the difference equation

$$ A(d)y(k) = B(d)u(k) \tag{6.3-1a} $$

$$ A(d) = 1 + a_1 d + \cdots + a_{n_a} d^{n_a} \tag{6.3-1b} $$

$$ B(d) = b_1 d + \cdots + b_{n_b} d^{n_b} \tag{6.3-1c} $$

where $n_a$ and $n_b$ are assigned and the $n_\theta := n_a + n_b$ parameters $a_i$ and $b_i$ are
unknown reals. We shall rewrite (1a) in the form

$$ y(k) = \varphi'(k-1)\theta \tag{6.3-2a} $$

$$ \varphi(k-1) := \left[\, -y(k-1) \ \cdots \ -y(k-n_a) \ \ u(k-1) \ \cdots \ u(k-n_b) \,\right]' \tag{6.3-2b} $$

$$ \theta := \left[\, a_1 \ \cdots \ a_{n_a} \ \ b_1 \ \cdots \ b_{n_b} \,\right]' \tag{6.3-2c} $$

The problem is to estimate $\theta$ from the knowledge of $y(k)$ and $\varphi(k-1)$, $k \in \mathbb{Z}_1$.
We begin by applying Theorem 1-1 so as to find the ISLM estimate of $\theta$.
As in Example 1-1, we set

$$ w := \theta \qquad p_k := \varphi(k-1) \qquad m_k := y(k) \tag{6.3-3} $$

$\mathcal{H}$: Euclidean vector space of dimension $n_\theta$
with inner product $\langle w, p_k \rangle = p_k' w$

Here the ISLM problem of Sect. 1 amounts to finding a recursive formula for updating
the minimum-norm system parameter vector $\theta$ interpolating the I/O data
up to time $t$.

Since here $\dim \mathcal{H} = n_\theta$, the $t$-innovator (1-34) consists of an $n_\theta \times n_\theta$ symmetric
nonnegative definite matrix

$$ P(t-1) := \mathcal{I}_t = I_{n_\theta} - S_{t-1} \tag{6.3-4} $$

$P(t-1)$ can be computed recursively via (1-37) as follows. First note that, because
of (1-39),

$$ L_t = \varphi'(t-1)P(t-1)\varphi(t-1). \tag{6.3-5} $$

Then, setting

$$ \theta(t) := \hat\theta_{|t} \tag{6.3-6} $$

the following algorithm follows at once from (1-44).

Orthogonalized Projection Algorithm

$$ \theta(t+1) = \begin{cases} \theta(t)\,, & \text{if } L_{t+1} = 0 \\[4pt] \theta(t) + \dfrac{P(t)\varphi(t)}{\varphi'(t)P(t)\varphi(t)}\left[y(t+1) - \varphi'(t)\theta(t)\right], & \text{otherwise} \end{cases} \tag{6.3-7a} $$

$$ P(t+1) = \begin{cases} P(t)\,, & \text{if } L_{t+1} = 0 \\[4pt] P(t) - \dfrac{P(t)\varphi(t)\varphi'(t)P(t)}{\varphi'(t)P(t)\varphi(t)}\,, & \text{otherwise} \end{cases} \tag{6.3-7b} $$

with $\theta(0) = O_{n_\theta}$ and $P(0) = I_{n_\theta}$. Note that $L_{t+1} = 0$ is equivalent to the condition
$\varphi(t) \in \mathrm{Span}\{\varphi(k)\}_{k=0}^{t-1}$.

The name of the algorithm (7) is justified by the fact that $\theta(t)$ given by (7) equals
the orthogonal projection of the unknown $\theta$ onto $\mathrm{Span}\{\varphi(k)\}_{k=0}^{t-1}$. The condition
on $L_{t+1}$ has to be used since $\varphi(t)$ can be linearly dependent on $\{\varphi(k)\}_{k=0}^{t-1}$. In any
case, after $n_\theta$ output observations corresponding to $n_\theta$ linearly independent vectors
$\varphi(k)$, the orthogonalized projection algorithm converges to the true $\theta$ vector in the
ideal deterministic case under consideration. As will be seen soon, in order to face
the non-ideal case in which (7) becomes impractical, the algorithm can be modified
in various ways, e.g. recursive least squares. These modified algorithms are usually
started up by assigning arbitrary initial values to $\theta(0)$. This makes the estimates
$\theta(t)$ dependent on the chosen initialization. A correction to this procedure is to
exploit the innovative initial portion of the orthogonalized projection algorithm so
as to get a data-based initial guess on the unknown $\theta$ to start up the modified
recursions.

By adhering to a terminology borrowed from statistics we shall refer to $\varphi(t)$
and $y(t+1)$ as the regressor and, respectively, the regressand.
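The orthogonalized projection algorithm (7) can be sketched in a few lines. The example below uses plain lists, a first-order model $y(k) = -a_1 y(k-1) + b_1 u(k-1)$ with hypothetical parameter values, and a small numerical threshold `tol` in place of the exact test $L_{t+1} = 0$ (the threshold is an assumption of this sketch, not part of the algorithm as stated).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(P, v):
    return [dot(row, v) for row in P]


def orth_proj_step(theta, P, phi, y_next, tol=1e-12):
    """One step of (6.3-7a)-(6.3-7b); returns the pair (theta(t+1), P(t+1))."""
    P_phi = mat_vec(P, phi)
    L = dot(phi, P_phi)                     # L_{t+1} = phi'(t) P(t) phi(t), cf. (6.3-5)
    if L <= tol:                            # phi(t) already spanned by earlier regressors
        return theta, P
    err = y_next - dot(phi, theta)
    theta_new = [th + pp * err / L for th, pp in zip(theta, P_phi)]
    n = len(theta)
    # P is symmetric, so P phi phi' P equals the outer product of P_phi with itself.
    P_new = [[P[i][j] - P_phi[i] * P_phi[j] / L for j in range(n)] for i in range(n)]
    return theta_new, P_new


theta_true = [-0.5, 2.0]                    # hypothetical [a1, b1]
u = [1.0, -1.0, 1.0, 1.0]
y = [0.0]
theta, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]   # theta(0) = O, P(0) = I
for t in range(len(u)):
    phi = [-y[t], u[t]]                     # regressor (6.3-2b) with n_a = n_b = 1
    y.append(dot(phi, theta_true))          # noise-free regressand, cf. (6.3-2a)
    theta, P = orth_proj_step(theta, P, phi, y[t + 1])

print(theta)
```

After the first two steps the regressors span $\mathbb{R}^2$, $P$ becomes the zero matrix and the later updates leave the estimate unchanged, in agreement with the convergence property discussed above.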

Example 6.3-1 (FIR estimation by PRBS) Consider the system (1) with $n_a = 0$, viz. $y(t) = B(d)u(t)$.
This is usually called a finite impulse response (FIR) system. Note that, since $n_a = 0$,
here $\varphi(k-1) = [\, u(k-1) \ \cdots \ u(k-n_b) \,]'$, $\theta = [\, b_1 \ b_2 \ \cdots \ b_{n_b} \,]'$, and hence $n_\theta = n_b$. We assume that the
input signal $u(t)$ is a specific probing signal made up of a periodic pseudorandom binary sequence
(PRBS) [PW71], [SS89]. This signal has amplitude $+V$ and $-V$ and a period of $L$ steps, with
$L$, called length of the sequence, taking on the values $L = 2^i - 1$, $i = 2, 3, \cdots$. It is assumed
that $n_b \le L$, i.e. that the system memory does not exceed the sequence period. This assumption
is only made for the sake of simplicity, being inessential in view of the results in [FN74]. If the
$y(t)$-samples are used in (7) after the test input has been applied to the system for at least $L$
steps, thanks to the characterizing property of the PRBS autocorrelation function we have

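$$ \langle p_k, p_\tau \rangle = \varphi'(\tau - 1)\varphi(k - 1) = \begin{cases} LV^2\,, & \tau = k \\ -V^2\,, & \tau \ne k \end{cases} \tag{6.3-8} $$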



Instead of using directly (7), we can exploit the PRBS autocorrelation function property to rewrite 

(7) in the following simplified form [Mos75] 

0(t) = 0(t - 1) + -J^^2 £t ( 6 - 3 " 9a ) 

e t = ip(t - 1) - ip(t - 2) + ottet-i (6.3-9b) 

et = y(t) - y(t - 1) + atet-i (6.3-9c) 
L-t + 3 

a* = — — (6.3-9d) 

t = 1, 2, • • ■ , L, e = tp(-2) = O nb , so = y(0) = 0. 

Assume that the system simply delays the input by 16 steps. Use (9) with a PRBS of length
$L = 31$ and assume also $n_b = 31$. Three estimates $\theta(t)$, $t = 15, 30, 31$, are shown in Fig. 1.
Note that $\theta(31)$ is an exact reproduction of the impulse response of the system since, because of
(8), $\{\varphi(k-1)\}_{k=1}^{31}$ is a set of linearly independent vectors in $\mathbb{R}^{31}$.

The above discussion concerns an ideal deterministic situation. In a more realistic case, the
system output $y(t)$ is affected by a noise or disturbance $n(t)$

$$ y(t) = \varphi'(t-1)\theta + n(t) $$

Provided that $\mathcal{E}\{n(t)\} = 0$ and $\mathcal{E}\{n(t)n(t+k)\} = 0$, $k > 31$, this situation can be tackled by
performing successively a number of separate estimates $\theta(31)$ based on 31 I/O pairs and averaging
them.

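A simplification is to set $P(t) = I_{n_\theta}$ in the orthogonalized projection algorithm.
The resulting algorithm is called the projection algorithm.

Projection Algorithm

$$ \theta(t+1) = \begin{cases} \theta(t)\,, & \text{if } \varphi(t) = O_{n_\theta} \\[4pt] \theta(t) + \dfrac{\varphi(t)}{\|\varphi(t)\|^2}\left[y(t+1) - \varphi'(t)\theta(t)\right], & \text{otherwise} \end{cases} \tag{6.3-10} $$

It follows from (2) that

$$ \frac{\varphi(t)}{\|\varphi(t)\|^2}\left[y(t+1) - \varphi'(t)\theta(t)\right] = \frac{\varphi(t)}{\|\varphi(t)\|}\,\frac{\varphi'(t)}{\|\varphi(t)\|}\,\tilde\theta(t) \qquad \tilde\theta(t) := \theta - \theta(t) $$

The estimate $\theta(t)$ of $\theta$ is then updated by summing to $\theta(t)$ the orthogonal projection
of the estimation error $\tilde\theta(t)$ onto $\varphi(t)/\|\varphi(t)\|$. Fig. 2 illustrates how the algorithm
works assuming that $\theta \in \mathbb{R}^2$ and $\theta(0) = O_2$. We see that, despite the fact that $\varphi(0)$
and $\varphi(1)$ are linearly independent, $\theta(2) \ne \theta$. Note that while the orthogonalized
projection algorithm yields $\theta(2) = \theta$, provided that $\mathrm{Span}\{\varphi(0), \varphi(1)\} = \mathbb{R}^2$, this
holds true with the projection algorithm if and only if it also happens that $\varphi(1) \perp \varphi(0)$.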
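The role of orthogonality between successive regressors can be illustrated with a small sketch of the projection algorithm (10) in $\mathbb{R}^2$, starting from $\theta(0) = O_2$. The parameter vector and the regressors below are hypothetical, and the data are noise-free.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def projection_step(theta, phi, y_next):
    """One step of the projection algorithm (6.3-10)."""
    norm2 = dot(phi, phi)
    if norm2 == 0.0:                       # zero regressor: keep the estimate
        return theta
    err = y_next - dot(phi, theta)
    return [th + p * err / norm2 for th, p in zip(theta, phi)]


theta_true = [1.0, 2.0]                    # hypothetical unknown parameter


def run(regressors):
    theta = [0.0, 0.0]                     # theta(0) = O_2
    for phi in regressors:
        y_next = dot(phi, theta_true)      # noise-free regressand
        theta = projection_step(theta, phi, y_next)
    return theta


orthogonal = run([[1.0, 0.0], [0.0, 1.0]])   # phi(1) orthogonal to phi(0)
skewed = run([[1.0, 0.0], [1.0, 1.0]])       # linearly independent, not orthogonal

print(orthogonal)
print(skewed)
```

With orthogonal regressors the estimate equals the true parameter after two steps. With the skewed pair the second projection disturbs the component identified at the first step, so $\theta(2) \ne \theta$ even though the two regressors span $\mathbb{R}^2$.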

Figure 6.3-1: Orthogonalized projection algorithm estimate of the impulse response
of a 16-step delay system when the input is a PRBS of period 31.

Figure 6.3-2: Geometric interpretation of the projection algorithm.

An alternative to (10) which avoids the need of checking $\|\varphi(t)\|$ for zero is
the following slightly modified form of the algorithm. In some filtering literature
[Joh88] this algorithm is also known as the Normalized Least-Mean-Squares algorithm
[WS85].

Modified Projection Algorithm

$$ \theta(t+1) = \theta(t) + \frac{a\,\varphi(t)}{c + \|\varphi(t)\|^2}\left[y(t+1) - \varphi'(t)\theta(t)\right] \tag{6.3-11} $$

with $\theta(0)$ given and $c > 0$, $0 < a < 2$.
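A minimal sketch of the modified projection algorithm (11) follows. It cycles through a few hypothetical regressors, including a zero one, to show that no test on $\|\varphi(t)\|$ is needed and that, with noise-free data and $0 < a < 2$, the squared estimation error never increases. The values of `a`, `c` and the data are illustrative assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def modified_projection_step(theta, phi, y_next, a=1.0, c=1e-3):
    """One step of (6.3-11); c > 0 keeps the denominator away from zero."""
    err = y_next - dot(phi, theta)
    scale = a * err / (c + dot(phi, phi))
    return [th + scale * p for th, p in zip(theta, phi)]


theta_true = [1.0, 2.0]                                 # hypothetical unknown parameter
regressors = [[1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]       # includes a zero regressor
theta = [0.0, 0.0]
errors = []                                             # squared error after each step
for sweep in range(50):
    for phi in regressors:
        y_next = dot(phi, theta_true)                   # noise-free regressand
        theta = modified_projection_step(theta, phi, y_next, a=1.0, c=1e-3)
        errors.append(sum((tt - th) ** 2 for tt, th in zip(theta_true, theta)))

print(theta, errors[-1])
```

The price paid for removing the zero test is that each step falls slightly short of a full projection, by the factor $c/(c + \|\varphi(t)\|^2)$ along $\varphi(t)$ when $a = 1$.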

Example 6.3-2 [LM76]. Fig. 3 reports results of simulations of recursive estimation of the
impulse response $\theta$ of a 6-pole Butterworth filter with cutoff frequency of 0.8 kHz and sampling
rate equal to 2.4 kHz using as probing signal the same PRBS as in Example 1. Two estimation
algorithms are considered: the orthogonalized projection algorithm (9) (solid lines), and the
modified projection algorithm (11) (dotted lines). As indicated in Example 1, for both algorithms the
observations $y(t)$ were affected by an additive stationary, zero-mean, white, Gaussian noise with

'EfW*M 



variance a^. SNR denotes signal-to— noise ratio in dB given by SNR = f01og 10 i — ^ 

Fig. 3 shows the experimental resulting mean-square error in estimating 8. The estimation based 
on (9) was obtained by carrying out TV separate estimates of 8, each based on L = 31 input- 
output pairs and, then, averaging the corresponding estimates. In Fig. 3 the abscissa is N for the 
algorithm (9), whereas is the overall number of recursions t for the algorithm (11). The various 
curves for a given SNR correspond to different noise sequences. Note the algorithm (9) yields a 
lower residual error \\8(t)\\ 2 than (11). Further, if SNR decreases by 20 dB, ||6»(t)|| 2 decreases by 
the same amount for the algorithm (9). Note that this is not true for the algorithm (11). 

Figure 6.3-3: Recursive estimation of the impulse response $\theta$ of the 6-pole Butterworth
filter of Example 2.

As can be seen from the above discussion, in the deterministic ideal case where no
disturbances are present and hence (2) holds true, the Orthogonalized Projection
algorithm and the Projection or Modified Projection algorithm have a comparable
rate of convergence provided that initially the regressors $\varphi(k)$ are almost mutually
orthogonal. Thanks to (8), this happens with PRBSs of period $L$ large enough.
In the general case, however, the Orthogonalized Projection algorithm exhibits a
much faster convergence than the Projection or Modified Projection algorithm. It
is therefore important to suitably modify the Orthogonalized Projection algorithm
so as to retain its favourable convergence properties in nonideal cases where (2)
no longer holds exactly. For instance, the need of checking $L_{t+1}$ for
zero at each step of the Orthogonalized Projection algorithm can be avoided by
modifying (7) as follows

\[ \theta(t+1) = \theta(t) + \frac{P(t)\varphi(t)}{c + \varphi'(t)P(t)\varphi(t)}\,[y(t+1) - \varphi'(t)\theta(t)] \tag{6.3-12a} \]

\[ P(t+1) = P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{c + \varphi'(t)P(t)\varphi(t)} \tag{6.3-12b} \]

where $c > 0$. When $c = 1$, (12) becomes the well-known Recursive Least-Squares
algorithm whose origin, as discussed in [You84], can be traced back to Gauss
[Gau63], who used the Least-Squares technique for calculating orbits of planets.

Recursive Least Squares (RLS) Algorithm 

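\[ \theta(t+1) = \theta(t) + \frac{P(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)}\,[y(t+1) - \varphi'(t)\theta(t)] \tag{6.3-13a} \]
\[ \phantom{\theta(t+1)} = \theta(t) + P(t+1)\varphi(t)\,[y(t+1) - \varphi'(t)\theta(t)] \tag{6.3-13b} \]

\[ P(t+1) = P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{1 + \varphi'(t)P(t)\varphi(t)} \tag{6.3-13c} \]
\[ \phantom{P(t+1)} = P(t) - K(t)\,[1 + \varphi'(t)P(t)\varphi(t)]\,K'(t) \tag{6.3-13d} \]
\[ \phantom{P(t+1)} = [I - K(t)\varphi'(t)]\,P(t)\,[I - K(t)\varphi'(t)]' + K(t)K'(t) \tag{6.3-13e} \]

with

\[ K(t) = \frac{P(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)} \tag{6.3-13f} \]

$\theta(0)$ given and $P(0)$ any symmetric and positive definite matrix.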
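The recursions (13a), (13c) and (13f) translate almost literally into code. The following is a minimal sketch in plain Python, with matrices stored as lists of rows; the function name `rls_step`, the data and the initialization $P(0) = 1000\,I$ are assumptions made for the example. After the three updates the estimate can be checked against the regularized least-squares solution, i.e. it satisfies $[\sum_k \varphi(k)\varphi'(k) + P^{-1}(0)]\,\theta(3) = \sum_k \varphi(k)y(k+1) + P^{-1}(0)\theta(0)$.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def rls_step(theta: List[float], P: Matrix, phi: List[float],
             y_next: float) -> Tuple[List[float], Matrix]:
    """One RLS update: gain (13f), estimate (13a), covariance (13c)."""
    n = len(theta)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]  # P(t) phi(t)
    denom = 1.0 + sum(phi[i] * Pphi[i] for i in range(n))               # 1 + phi' P phi
    K = [v / denom for v in Pphi]                                       # (13f)
    err = y_next - sum(phi[i] * theta[i] for i in range(n))             # a priori error
    theta_new = [theta[i] + K[i] * err for i in range(n)]               # (13a)
    # (13c); P symmetric, so P phi phi' P = (P phi)(P phi)'
    P_new = [[P[i][j] - Pphi[i] * Pphi[j] / denom for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


# Hypothetical noise-free data generated by theta = [2, -1].
theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
for phi, y_next in data:
    theta, P = rls_step(theta, P, phi, y_next)
```

The plain form (13c) is used here only for brevity; the numerically safer alternatives are (13e) and the factorized recursions discussed in Sect. 6.3.6.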

Problem 6.3-1 Use the Matrix Inversion Lemma (5.3-16) to show that the inverse $P^{-1}(t)$ of
$P(t)$ satisfying (13c) fulfills the following recursion

\[ P^{-1}(t+1) = P^{-1}(t) + \varphi(t)\varphi'(t) \tag{6.3-13g} \]

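Proposition 6.3-1 (RLS and Normal Equations). Let $\{\theta(k)\}_{k=0}^{t}$ be
given by the RLS algorithm (13). Then, $\theta(t)$ satisfies the normal equations

\[ \Big[\sum_{k=0}^{t-1}\varphi(k)\varphi'(k)\Big]\theta(t) = \sum_{k=0}^{t-1}\varphi(k)y(k+1) + P^{-1}(0)\,[\theta(0) - \theta(t)] \tag{6.3-14} \]

and, hence, minimizes the criterion

\[ J_t(\theta) = \frac{1}{2}\Big\{\sum_{k=0}^{t-1}[y(k+1) - \varphi'(k)\theta]^2 + \|\theta - \theta(0)\|^2_{P^{-1}(0)}\Big\} \tag{6.3-15} \]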
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Figure 6.3-4: Geometrical illustration of the Least Squares solution. 



Proof Premultiply both sides of (13b) by $P^{-1}(t+1)$ to get

\[ P^{-1}(t+1)\theta(t+1) = P^{-1}(t+1)\theta(t) + \varphi(t)\,[y(t+1) - \varphi'(t)\theta(t)] \]
\[ \phantom{P^{-1}(t+1)\theta(t+1)} = P^{-1}(t)\theta(t) + \varphi(t)y(t+1) \qquad [(13g)] \]
\[ \phantom{P^{-1}(t+1)\theta(t+1)} = P^{-1}(0)\theta(0) + \sum_{k=0}^{t}\varphi(k)y(k+1) \]

Further, by (13g)

\[ P^{-1}(t+1) = P^{-1}(0) + \sum_{k=0}^{t}\varphi(k)\varphi'(k) \]

which, substituted in the L.H.S. of the previous equation, yields (14). Note that, since by the
assumed initialization $P^{-1}(0) > 0$, the system of normal equations (14) always has a unique
solution $\theta(t)$. Furthermore, it is a simple matter to check that $\theta(t)$ satisfying (14) minimizes
$J_t(\theta)$.



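In order to give a geometric interpretation to the RLS algorithm, let us consider
the normal equations (14), assuming that $P^{-1}(0)$ is small enough so as to make
$P^{-1}(0)[\theta(0) - \theta(t)]$ negligible w.r.t. the other terms:

\[ \Big[\sum_{k=0}^{t-1}\varphi(k)\varphi'(k)\Big]\theta(t) = \sum_{k=0}^{t-1}\varphi(k)y(k+1) \]

Set now

\[ Y := [\,y(1) \;\cdots\; y(t)\,]' \in \mathbb{R}^t, \qquad \rho_i := [\,\varphi_i(0) \;\cdots\; \varphi_i(t-1)\,]' \in \mathbb{R}^t, \qquad \theta(t) = [\,\theta_1(t) \;\cdots\; \theta_{n_\theta}(t)\,]', \qquad i \in \underline{n}_\theta \]

where $\varphi_i(k)$ denotes the $i$-th component of $\varphi(k)$ and $\underline{n}_\theta := \{1, 2, \cdots, n_\theta\}$. With such new notations, the above set of equations
can be rewritten as follows

\[ \sum_{j=1}^{n_\theta}\theta_j(t)\,\langle \rho_j, \rho_i\rangle = \langle Y, \rho_i\rangle, \qquad i \in \underline{n}_\theta \]

Comparing this equation with (1-6), we obtain Fig. 4 which is the analogue of
Fig. 1-1 to the present case. In Fig. 4 $\hat Y$ denotes the orthogonal projection of $Y$
onto the subspace $[\rho_{\underline{n}_\theta}]$ of $\mathbb{R}^t$ generated by $\{\rho_i, i \in \underline{n}_\theta\}$ and $\tilde Y := Y - \hat Y$. According
to (i) before (1-6), the vector $\theta(t)$ is then such that

\[ \hat Y = \sum_{j=1}^{n_\theta}\theta_j(t)\,\rho_j = \sum_{j=1}^{n_\theta}\theta_j(t)\,[\,\varphi_j(0) \;\cdots\; \varphi_j(t-1)\,]' \]

If, instead of (2a), we consider as equations to be "solved" w.r.t. $\theta$

\[ y(k) = \varphi'(k-1)\theta + n(k), \qquad k \in \underline{t} \]

where $n(k)$ represents an unknown equation error, for $P^{-1}(0)$ small enough and
$n_\theta < \dim[\mathbb{R}^t]$, the RLS finds a "solution" $\theta(t)$ to the above overdetermined system
of equations which minimizes $\sum_{k=0}^{t-1}[y(k+1) - \varphi'(k)\theta]^2$.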



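Problem 6.3-2 (RLS with data weighting) Consider the criterion

\[ J_t(\theta) = S_t(\theta) + \frac{1}{2}\|\theta - \theta(0)\|^2_{P^{-1}(0)} \tag{6.3-16a} \]
\[ S_t(\theta) = \frac{1}{2}\sum_{k=0}^{t-1}c(k)\,[y(k+1) - \varphi'(k)\theta]^2 \tag{6.3-16b} \]

with $c(k)$ nonnegative weighting coefficients and $P^{-1}(0) > 0$. Show that the vector $\theta$ minimizing
$J_t(\theta)$ is given by the following sequential algorithm called
RLS with data weighting

\[ \theta(t+1) = \theta(t) + \frac{c(t)P(t)\varphi(t)}{1 + c(t)\varphi'(t)P(t)\varphi(t)}\,[y(t+1) - \varphi'(t)\theta(t)] \tag{6.3-17a} \]
\[ \phantom{\theta(t+1)} = \theta(t) + c(t)P(t+1)\varphi(t)\,[y(t+1) - \varphi'(t)\theta(t)] \]
\[ P(t+1) = P(t) - \frac{c(t)P(t)\varphi(t)\varphi'(t)P(t)}{1 + c(t)\varphi'(t)P(t)\varphi(t)} \tag{6.3-17b} \]
\[ P^{-1}(t+1) = P^{-1}(t) + c(t)\varphi(t)\varphi'(t) \tag{6.3-17c} \]

Note that the Orthogonalized Projection algorithm (7) can be recovered from (17)
by setting

\[ c(k) = \begin{cases} 0, & \varphi'(k)P(k)\varphi(k) = 0 \\ \infty, & \text{otherwise} \end{cases} \tag{6.3-18} \]

This choice is in accordance with the use in the algorithm (7) of the quantity
$L_{t+1} = \varphi'(t)P(t)\varphi(t)$ as an indicator of the new information contained in $\varphi(t)$.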



RLS and Kalman Filtering 

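We now give a Kalman filter interpretation of RLS. Consider the following stochastic
model for an unknown, possibly time-varying, system parameter vector $x(t)$

\[ x(t+1) = x(t) + \xi(t), \qquad y(t) = \varphi'(t-1)x(t) + \zeta(t) \tag{6.3-19} \]

Let (19) satisfy the same conditions as (2-37). Applying Theorem 2-1, we find

\[ \hat x(t+1 \mid t) = \hat x(t \mid t-1) + K(t)\,[y(t) - \varphi'(t-1)\hat x(t \mid t-1)] \tag{6.3-20a} \]

\[ K(t) = \frac{\Pi(t)\varphi(t-1)}{\Psi_\zeta(t) + \varphi'(t-1)\Pi(t)\varphi(t-1)} \tag{6.3-20b} \]

with $\Pi(t)$ propagated by the Riccati recursion $\Pi(t+1) = \Pi(t) - K(t)\varphi'(t-1)\Pi(t) + \Psi_\xi(t)$, $\Psi_\xi(t)$ and $\Psi_\zeta(t)$ denoting the covariances of $\xi(t)$ and $\zeta(t)$, respectively.

Case 1 ($\Psi_\xi(t) = O_{n_\theta \times n_\theta}$, $\Psi_\zeta(t) = \psi_\zeta$). In such a case the first of (19) becomes
$x(t+1) = x(t) = \theta$. Furthermore, setting $\theta(t) := \hat x(t+1 \mid t)$ and $P(t-1) := \Pi(t)/\psi_\zeta$,
(20) become the same as the RLS algorithm (13).

Case 2 ($\Psi_\xi(t)$ arbitrary, $\Psi_\zeta(t) = \psi_\zeta$). Setting $\theta(t) := \hat x(t+1 \mid t)$, $P(t-1) := \Pi(t)/\psi_\zeta$,
and $Q(t) := \Psi_\xi(t)/\psi_\zeta$, (20) become the

RLS with Covariance Modification

\[ \left.\begin{aligned} \theta(t+1) &= \theta(t) + \frac{P(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)}\,[y(t+1) - \varphi'(t)\theta(t)] \\ P(t+1) &= P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{1 + \varphi'(t)P(t)\varphi(t)} + Q(t+1) \end{aligned}\right\} \tag{6.3-21} \]

Case 3 ($\Psi_\xi(t) = O_{n_\theta \times n_\theta}$, $\Psi_\zeta(t+1) = \lambda(t)\Psi_\zeta(t)$, $0 < \lambda(t) \le 1$). Setting again
$\theta(t) := \hat x(t+1 \mid t)$, $P(t-1) := \Pi(t)/\Psi_\zeta(t)$, (20) become the

Exponentially Weighted RLS

\[ \theta(t+1) = \theta(t) + \frac{P(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)}\,[y(t+1) - \varphi'(t)\theta(t)] \tag{6.3-22a} \]
\[ \phantom{\theta(t+1)} = \theta(t) + \lambda(t+1)P(t+1)\varphi(t)\,[y(t+1) - \varphi'(t)\theta(t)] \]

\[ P(t+1) = \frac{1}{\lambda(t+1)}\Big[P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{1 + \varphi'(t)P(t)\varphi(t)}\Big] \tag{6.3-22b} \]

\[ P^{-1}(t+1) = \lambda(t+1)\,[P^{-1}(t) + \varphi(t)\varphi'(t)] \tag{6.3-22c} \]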

The above Kalman filter solutions give valuable insights into the RLS algorithm:

• If, as in Case 1, the regressand and regressor are related by

\[ y(t) = \varphi'(t-1)\theta + \zeta(t) \tag{6.3-23} \]

e.g.

\[ A(d)y(t) = B(d)u(t) + \zeta(t) \]

with $A(d)$, $B(d)$ and $\varphi(t-1)$ as in (1) and (2), and $\zeta$ zero-mean, white and
Gaussian, the distributional interpretation of the Kalman filter tells us that
the conditional probability distribution of $\theta$ given $y^t$ equals

\[ P(\theta \mid y^t) = N(\theta(t), \psi_\zeta P(t)) \tag{6.3-24} \]

where $\theta(t)$ and $P(t)$ are generated by the RLS (13) initialized from $\theta(0) =
\mathcal{E}\{\theta\}$ and $P(0) = \mathcal{E}\{\tilde\theta(0)\tilde\theta'(0)\}$. In words, this means that $\theta(0)$ is what we
guess the parameter vector to be before the data are acquired, and $P(0)$ is
the larger, the lower our confidence in this guess.

• The comparison between (20) with $\Psi_\xi(t) = O_{n_\theta \times n_\theta}$ and (17) leads one to
conclude that RLS with data weighting is the same as the Kalman filter
provided that $c(k) = \Psi_\zeta^{-1}(k)$. In words, the larger the output noise at a
given time, the smaller the weight in (16).

• Case 2 corresponds to a time-varying system parameter vector consisting
of a process with uncorrelated increments (or a random walk). The related
solution (21) tells us that, in order to take into account these time variations,
to compute $P(t+1)$ the symmetric nonnegative definite matrix $Q(t+1)$ must
be added to the R.H.S. of the updating equation (13c) of the standard RLS.
In this way the matrix $P(t)$, and hence the updating gain $P(t)\varphi(t)[1 +
\varphi'(t)P(t)\varphi(t)]^{-1}$, is prevented from becoming too small.



The next problem indicates why the algorithm (22) of the above Case 3 is referred 
to as the Exponentially Weighted RLS. 



Problem 6.3-3 (Exponentially Weighted RLS) Consider again the criterion (16) with (16b)
now modified so as to make the weighting coefficient dependent on $t$ in an exponential fashion

\[ S_t(\theta) = \frac{1}{2}\sum_{k=1}^{t}c(t,k)\,[y(k) - \varphi'(k-1)\theta]^2 \tag{6.3-25a} \]

\[ c(t,t) = 1, \qquad c(t,k) = \lambda(t-1)\,c(t-1,k) = \prod_{i=k}^{t-1}\lambda(i) \tag{6.3-25b} \]

with

\[ 0 < \lambda(i) \le 1 \tag{6.3-25c} \]

In particular, if $\lambda(i) \equiv \lambda$, we have

\[ c(t,k) = \lambda^{t-k} \tag{6.3-25d} \]

Show that from (25b) it follows that

\[ S_{t+1}(\theta) = \lambda(t)S_t(\theta) + \frac{1}{2}[y(t+1) - \varphi'(t)\theta]^2 \tag{6.3-25e} \]

viz. the data in $S_t(\theta)$ are discounted in $S_{t+1}(\theta)$ by the factor $\lambda(t)$. Prove that the vector
minimizing (16) with $S_t(\theta)$ as in (25a) is given by the recursive algorithm (22).



6.3.2 Pseudolinear Regression Algorithms 



Recursive Extended Least Squares (RELS(PR)) (A Priori Prediction
Errors) This algorithm originates from the problem of recursively fitting an
ARMAX model

\[ A(d)y(t) = B(d)u(t) + C(d)e(t) \]

to the system I/O data sequence. The above ARMAX model can also be represented as

\[ y(t) = \varphi_e'(t-1)\theta + e(t) \]

\[ \varphi_e(t-1) := [\,-y(t-1) \;\cdots\; -y(t-n_a) \;\; u(t-1) \;\cdots\; u(t-n_b) \;\; e(t-1) \;\cdots\; e(t-n_c)\,]' \]

\[ \theta = [\,a_1 \;\cdots\; a_{n_a} \;\; b_1 \;\cdots\; b_{n_b} \;\; c_1 \;\cdots\; c_{n_c}\,]' \tag{6.3-26a} \]

If $\varphi_e(t-1)$ were available, the RLS algorithm could be used to recursively estimate
$\theta$. In reality, all of $\varphi_e(t-1)$ is known except for its last $n_c$ components. In
RELS(PR) such components are replaced by using the a priori prediction errors

\[ \varepsilon(k) := y(k) - \varphi'(k-1)\theta(k-1) \tag{6.3-26b} \]

where $\varphi(t-1)$ is given here by the pseudo-regressor

\[ \varphi(t-1) = [\,-y(t-1) \;\cdots\; -y(t-n_a) \;\; u(t-1) \;\cdots\; u(t-n_b) \;\; \varepsilon(t-1) \;\cdots\; \varepsilon(t-n_c)\,]' \tag{6.3-26c} \]

and, for the rest, the RELS(PR) is the same as the RLS (13):

\[ \theta(t+1) = \theta(t) + P(t+1)\varphi(t)\varepsilon(t+1) \tag{6.3-26d} \]
\[ P(t+1) = P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{1 + \varphi'(t)P(t)\varphi(t)} \tag{6.3-26e} \]



Recursive Extended Least Squares (RELS(PO)) (A Posteriori Prediction
Errors) This method is the same as RELS(PR) with the only exception that here,
instead of the a priori prediction errors $\varepsilon(k)$, the a posteriori prediction errors

\[ \bar\varepsilon(k) := y(k) - \varphi'(k-1)\theta(k) \tag{6.3-27a} \]

are used in the pseudo-regression vector (26c). Hence, (26c) is replaced here with

\[ \varphi(t-1) = [\,-y(t-1) \;\cdots\; -y(t-n_a) \;\; u(t-1) \;\cdots\; u(t-n_b) \;\; \bar\varepsilon(t-1) \;\cdots\; \bar\varepsilon(t-n_c)\,]' \tag{6.3-27b} \]
The RELS(PR) and RELS(PO) methods will be both simply referred to as the 
RELS method whenever no further distinction is required. The RELS method was 
first proposed in [ABW65], [May65], [Pan68] and [You68]. 

The use of a posteriori prediction errors in RELS was introduced by [You74] 
and turned out ([Sol79] and [Che81]) to be instrumental for avoiding parameter 
estimate monitoring, viz., the projection of parameter estimates into a stability 
region so as to ensure the stability of the recursive scheme ([Han76] and [Lju77a]). 

In accordance with the terminology that we have already adopted, the RELS
regressors in (26c) and (27b) are often called pseudo-regression vectors, and the
RELS are sometimes referred to as Pseudo Linear Regression, so as to point
out the intrinsic nonlinearity in $\theta$ of the algorithm, $\varphi(t-1)$ being dependent on
previous estimates via (26b) or (27a). Another name for RELS is Approximate
Maximum Likelihood algorithm, this choice being justified in that RELS can be
regarded as a simplification of the next algorithm.

Recursive Maximum Likelihood (RML) This algorithm, similarly to RELS,
aims at recursively fitting an ARMAX model to the system I/O data sequence.
The RML algorithm is given as follows

\[ \theta(t+1) = \theta(t) + P(t+1)\psi(t)\varepsilon(t+1) \tag{6.3-28a} \]
\[ P(t+1) = P(t) - \frac{P(t)\psi(t)\psi'(t)P(t)}{1 + \psi'(t)P(t)\psi(t)} \tag{6.3-28b} \]

with $\varepsilon(t)$ as in (26b). In order to define $\psi(t)$, let

\[ C(t,d) := 1 + c_1(t)d + \cdots + c_{n_c}(t)d^{n_c} \tag{6.3-28c} \]

where the $c_i(t)$'s, $i = 1, \cdots, n_c$, are the last $n_c$ components of $\theta(t)$

\[ \theta(t) = [\,a_1(t) \;\cdots\; a_{n_a}(t) \;\; b_1(t) \;\cdots\; b_{n_b}(t) \;\; c_1(t) \;\cdots\; c_{n_c}(t)\,]' \tag{6.3-28d} \]

Then $\psi(t)$ is obtained by filtering the pseudo-regressor $\varphi(t)$ in (26c) as follows

\[ C(t,d)\psi(t) = \varphi(t) \tag{6.3-28e} \]

or

\[ \psi(t) = \varphi(t) - c_1(t)\psi(t-1) - \cdots - c_{n_c}(t)\psi(t-n_c) \tag{6.3-28f} \]

This means that

\[ \psi(t-1) = [\,-y_f(t-1) \;\cdots\; -y_f(t-n_a) \;\; u_f(t-1) \;\cdots\; u_f(t-n_b) \;\; \varepsilon_f(t-1) \;\cdots\; \varepsilon_f(t-n_c)\,]' \tag{6.3-28g} \]

where

\[ C(t,d)y_f(t) = y(t) \tag{6.3-28h} \]

and similarly for $u_f(t)$ and $\varepsilon_f(t)$. It must be underlined that the "exact" construction
of $\psi(t-1)$ requires the use of the $n_c$ "fixed" filters $C(t-i,d)$, $i = 1, \cdots, n_c$,
e.g. $C(t-i,d)y_f(t-i) = y(t-i)$, and hence storage of the related $n_c^2$ parameters.

The above algorithm can be properly called the RML with a priori prediction
errors. The RML with a posteriori prediction errors is instead obtained by
substituting the definition of $\psi(t-1)$ in (28g) with

\[ \psi(t-1) = [\,-y_f(t-1) \;\cdots\; -y_f(t-n_a) \;\; u_f(t-1) \;\cdots\; u_f(t-n_b) \;\; \bar\varepsilon_f(t-1) \;\cdots\; \bar\varepsilon_f(t-n_c)\,]' \tag{6.3-29a} \]

with $\bar\varepsilon_f(t)$ the following filtered a posteriori prediction error

\[ C(t,d)\bar\varepsilon_f(t) = \bar\varepsilon(t) = y(t) - \varphi'(t-1)\theta(t) \tag{6.3-29b} \]

Stochastic Gradient Algorithms They resemble either the RLS or the RELS
but have the simplifying feature that the matrix $P(t+1)$ in either (13b) or (26d)
is replaced by $a/\mathrm{Tr}\,P^{-1}(t+1)$:

\[ \theta(t+1) = \theta(t) + \frac{a\,\varphi(t)}{q(t+1)}\,\varepsilon(t+1), \qquad a > 0 \]
\[ \varepsilon(t+1) = y(t+1) - \varphi'(t)\theta(t) \]
\[ q(t+1) = q(t) + \|\varphi(t)\|^2 \]

where $\theta(0) \in \mathbb{R}^{n_\theta}$, $q(0) > 0$. According to the model that has to be fitted to the
experimental data, the vector $\varphi(t)$ is as in (2) for an ARX model or, alternatively,
as in (26c) or (27b) for an ARMAX model. Stochastic gradient algorithms can
also be considered as extensions of the Modified Projection algorithm (11).
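A single stochastic gradient update can be sketched as follows. This is a minimal illustration in plain Python; the function name `sg_step` and the numerical values are assumptions made for the example. Note that only the scalar $q(t)$ has to be stored and updated, in place of the matrix $P(t)$ of RLS.

```python
from typing import List, Sequence, Tuple


def sg_step(theta: Sequence[float], q: float, phi: Sequence[float],
            y_next: float, a: float = 1.0) -> Tuple[List[float], float]:
    """One stochastic gradient update with scalar normalization q(t)."""
    q_next = q + sum(p * p for p in phi)                     # q(t+1) = q(t) + ||phi(t)||^2
    err = y_next - sum(p * th for p, th in zip(phi, theta))  # a priori prediction error
    theta_next = [th + a * p * err / q_next for th, p in zip(theta, phi)]
    return theta_next, q_next


# Hypothetical data: theta(0) = 0, q(0) = 1, phi(0) = [1, 2], y(1) = 5.
theta1, q1 = sg_step([0.0, 0.0], 1.0, [1.0, 2.0], 5.0)
```

With these numbers $q(1) = 6$ and $\theta(1) = [5/6, \; 10/6]'$.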
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6.3.3 Parameter Estimation for MIMO Systems 



In the previous part of this section we have taken the system to be SISO to simplify 
the notation. We now indicate how the parameter estimation algorithms can be 
extended to the MIMO case. This extension is straightforward and the reader 
should have no difficulty in constructing the appropriate MIMO versions of the 
previous algorithms. 

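We base our discussion on a MIMO system ARMAX model $A(d)y(t) = B(d)u(t) +
C(d)e(t)$ with $A(d) = I_p + A_1 d + \cdots + A_{n_a}d^{n_a}$, $B(d) = B_1 d + \cdots + B_{n_b}d^{n_b}$, and
$C(d) = I_p + C_1 d + \cdots + C_{n_c}d^{n_c}$. We have

\[ y^i(t) = -a_1^i y(t-1) - \cdots - a_{n_a}^i y(t-n_a) + b_1^i u(t-1) + \cdots + b_{n_b}^i u(t-n_b) + c_1^i e(t-1) + \cdots + c_{n_c}^i e(t-n_c) + e^i(t) = \varphi_e'(t-1)\theta^i + e^i(t) \tag{6.3-30a} \]

where the following notations are used

$y^i(t)$: the $i$-th component of $y(t) \in \mathbb{R}^p$, $i = 1, \cdots, p$
$a_j^i$: the $i$-th row of $A_j$, $j = 1, \cdots, n_a$
$b_j^i$: the $i$-th row of $B_j$, $j = 1, \cdots, n_b$
$c_j^i$: the $i$-th row of $C_j$, $j = 1, \cdots, n_c$

\[ \varphi_e'(t-1) := [\,-y'(t-1) \;\cdots\; -y'(t-n_a) \;\; u'(t-1) \;\cdots\; u'(t-n_b) \;\; e'(t-1) \;\cdots\; e'(t-n_c)\,], \qquad \theta^i := [\,a_1^i \;\cdots\; a_{n_a}^i \;\; b_1^i \;\cdots\; b_{n_b}^i \;\; c_1^i \;\cdots\; c_{n_c}^i\,]' \tag{6.3-30b} \]

Then, we can write

\[ y(t) = \Phi_e'(t-1)\theta + e(t) \tag{6.3-30c} \]
\[ \Phi_e'(t-1) := \mathrm{block\,diag}\{\varphi_e'(t-1), \cdots, \varphi_e'(t-1)\} \quad (p \text{ blocks}) \tag{6.3-30d} \]
\[ \theta := [\,(\theta^1)' \;\; (\theta^2)' \;\cdots\; (\theta^p)'\,]' \in \mathbb{R}^{n_\theta} \tag{6.3-30e} \]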
We indicate the related RELS algorithm with a priori prediction errors:

\[ \theta(t+1) = \theta(t) + P(t+1)\Phi(t)\varepsilon(t+1) \tag{6.3-31a} \]
\[ P(t+1) = P(t) - P(t)\Phi(t)\,[I_p + \Phi'(t)P(t)\Phi(t)]^{-1}\,\Phi'(t)P(t) \tag{6.3-31b} \]
\[ \Phi'(t-1) := \mathrm{block\,diag}\{\varphi'(t-1), \cdots, \varphi'(t-1)\} \tag{6.3-31c} \]
\[ \varphi'(t-1) := [\,-y'(t-1) \;\cdots\; -y'(t-n_a) \;\; u'(t-1) \;\cdots\; u'(t-n_b) \;\; \varepsilon'(t-1) \;\cdots\; \varepsilon'(t-n_c)\,] \tag{6.3-31d} \]
\[ \varepsilon(t+1) = y(t+1) - \Phi'(t)\theta(t) \tag{6.3-31e} \]



6.3.4 The Minimum Prediction Error Method 

So far we have discussed in a quite informal way a number of system parameter
estimation algorithms. Nevertheless, our initial thrust, the Orthogonalized Projection
algorithm, was justified by considering (2) an exact deterministic system model
relating the unknown parameter vector $\theta$ to the I/O data. Our departure from
an exact deterministic modelling assumption was undertaken by adopting sensible
modifications to the initial deterministic algorithm. The resulting modified
algorithms, basically variants of the RLS algorithm, were next reinterpreted as solutions
of Kalman filtering problems for exact stochastic system models. A conclusion to
the above considerations is that so far our basic underlying presumption has been
the availability of an exact, either deterministic or stochastic, system model.

The basic idea of the Minimum Prediction Error (MPE) method is to fit a
prediction model, parameterized by a vector $\theta$, to the recorded I/O data. The
parameter $\theta$ selected by the method is then the one for which the prediction errors
are minimized in some sense. In this way the search for a true parameterized
model is abandoned, and one seeks instead the best parameterized predictor in a
given class. Consequently, the MPE method focuses attention on the approximation
of the observed data through models of limited or reduced complexity. Our main
goal is to introduce the MPE method and show that the majority of the estimation
algorithms discussed so far can be derived within the MPE framework.

The kind of approximation that is sought in the MPE method is motivated by
the fact that in many applications the system model is used for prediction. This is
often inherently the case for control system synthesis. Most systems are stochastic,
viz. the output at time $t$ cannot be exactly determined from I/O data up to time $t-1$.
We have already touched upon the topic in Problem 2-2 for stochastic state-space
descriptions and Kalman filtering. Here we denote by $\hat y(t \mid t-1; \theta)$ the (one-step-ahead)
prediction of the system output $y(t)$. $\hat y(t \mid t-1; \theta)$ depends on both the
I/O data up to time $t-1$ and the model parameter vector $\theta$. The rule according
to which the prediction is computed is called the predictor and

\[ \varepsilon(t,\theta) := y(t) - \hat y(t \mid t-1; \theta) \tag{6.3-32a} \]

the prediction error. It is therefore appealing to determine $\theta$ by minimizing the
cost

\[ J_M(\theta) = \frac{1}{M}\sum_{t=1}^{M}\|\varepsilon(t,\theta)\|_Q^2 \tag{6.3-32b} \]

with $Q = Q' > 0$. In the discussion following Proposition 1 we have seen that
for $P^{-1}(0)$ small enough and $M$ large, the RLS algorithm tends to minimize (32b)
when

\[ \hat y(t \mid t-1; \theta) = \varphi'(t-1)\theta \tag{6.3-33} \]

with $\varphi(t-1)$ known at time $t$ and hence independent of $\theta$.

A model parameter vector $\theta$ obtained by minimizing (32b) is called a
minimum prediction error (MPE) estimate.



Example 6.3-3 (Prediction for an ARMAX model) Consider the ARMAX model (2-69)

\[ A(d)y(t) = B(d)u(t) + C(d)e(t) \tag{6.3-34a} \]

or

\[ y(t) = [I_p - A(d)]\,y(t) + B(d)u(t) + [C(d) - I_p]\,e(t) + e(t) \]

Since the innovation $e(t)$ at time $t$ can be computed in terms of the I/O data via

\[ e(t) = C^{-1}(d)A(d)y(t) - C^{-1}(d)B(d)u(t) \tag{6.3-34b} \]

a reasonable choice is to set

\[ \hat y(t \mid t-1; \theta) = [I_p - A(d)]\,y(t) + B(d)u(t) + [C(d) - I_p]\,e(t) \]
\[ \phantom{\hat y(t \mid t-1; \theta)} = C^{-1}(d)\{[C(d) - A(d)]\,y(t) + B(d)u(t)\} \qquad [(34b)] \]

or

\[ C(d)\hat y(t \mid t-1; \theta) = [C(d) - A(d)]\,y(t) + B(d)u(t) \tag{6.3-34c} \]

From (32a) it then follows that

\[ C(d)\varepsilon(t,\theta) = A(d)y(t) - B(d)u(t) \tag{6.3-34d} \]

Here $\theta$ is the vector collecting all the free entries of the matrices $A(d)$, $B(d)$ and $C(d)$ which
parameterize the ARMAX model (34a). Notice that $C(d)$ is not completely free, being required,
according to Theorem 2-4, to be strictly Hurwitz. Hence, from (34d) and (34a) it follows that
$\varepsilon(t,\theta) = e(t)$ under the choice (34c) for $\hat y(t \mid t-1; \theta)$. It is a simple exercise to see that (34c)
yields the MMSE prediction of $y(t)$ based on the I/O data up to time $t-1$, provided that (34a) is a correct model
for the I/O data. In fact, if we let $\tilde y(t)$ be any function of the I/O data up to time $t-1$, the minimum of

\[ \mathcal{E}\{\|y(t) - \tilde y(t)\|_Q^2\} = \mathcal{E}\{\|\hat y(t \mid t-1; \theta) + e(t) - \tilde y(t)\|_Q^2\} \]
\[ \phantom{\mathcal{E}\{\|y(t) - \tilde y(t)\|_Q^2\}} = \mathcal{E}\{\|\hat y(t \mid t-1; \theta) - \tilde y(t)\|_Q^2\} + \mathcal{E}\{\|e(t)\|_Q^2\} \]

is attained at $\tilde y(t) = \hat y(t \mid t-1; \theta)$.
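For a SISO model, (34d) is simply a difference equation driven by the I/O data, so the prediction errors, and hence the predictions $\hat y = y - \varepsilon$, can be computed recursively. The following plain Python sketch illustrates this under stated assumptions: the function names, the first-order model, the input sequence and the "innovation" samples are hypothetical, and all signals are taken to be zero before $t = 0$. Since the data are generated by the same model used in the predictor, the computed prediction errors reproduce the innovations.

```python
from typing import List, Sequence


def simulate_armax(u: Sequence[float], e: Sequence[float], a: Sequence[float],
                   b: Sequence[float], c: Sequence[float]) -> List[float]:
    """Generate y from A(d)y = B(d)u + C(d)e with zero initial conditions."""
    y: List[float] = []
    for t in range(len(u)):
        val = e[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                val -= ai * y[t - i]
        for i, bi in enumerate(b, start=1):
            if t - i >= 0:
                val += bi * u[t - i]
        for i, ci in enumerate(c, start=1):
            if t - i >= 0:
                val += ci * e[t - i]
        y.append(val)
    return y


def armax_prediction_errors(y: Sequence[float], u: Sequence[float],
                            a: Sequence[float], b: Sequence[float],
                            c: Sequence[float]) -> List[float]:
    """Solve C(d) eps(t) = A(d) y(t) - B(d) u(t), i.e. (6.3-34d), recursively."""
    eps: List[float] = []
    for t in range(len(y)):
        val = y[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                val += ai * y[t - i]
        for i, bi in enumerate(b, start=1):
            if t - i >= 0:
                val -= bi * u[t - i]
        for i, ci in enumerate(c, start=1):
            if t - i >= 0:
                val -= ci * eps[t - i]
        eps.append(val)
    return eps


# Hypothetical model A(d) = 1 - 0.5d, B(d) = d, C(d) = 1 + 0.3d.
a, b, c = [-0.5], [1.0], [0.3]
u = [1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
e = [0.1, -0.2, 0.05, 0.0, 0.3, -0.1]          # fixed samples standing in for the innovations
y = simulate_armax(u, e, a, b, c)
eps = armax_prediction_errors(y, u, a, b, c)
y_hat = [yt - et for yt, et in zip(y, eps)]     # one-step-ahead predictions (34c)
```

The recursion on past values of `eps` is the reason why the predictor is not a finite-memory function of the I/O data unless $C(d) = 1$.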

It is to be remarked that (34c) does not allow $\hat y(t \mid t-1; \theta)$ to be expressed in terms
of a finite number of past I/O pairs. This only happens when $C(d) = I_p$ and hence
the ARMAX model (34a) collapses to the ARX model $A(d)y(t) = B(d)u(t) + e(t)$.
As seen after (32b), in the latter case the MPE estimate is given as $M \to \infty$ by the
RLS estimate, initialized by a small $P^{-1}(0)$.
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Figure 6.3-5: Block diagram of the MPE estimation method. 



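Fig. 5, where "Process" indicates the real system with input $u(t)$ and
output $y(t)$, provides an illustration of the MPE method. To be specific, we
have indicated in (32b) only one possible form of the cost to be minimized.
Another possible choice, motivated by Maximum Likelihood estimation [SS89], is

\[ J_M(\theta) = \det\Big[\frac{1}{M}\sum_{t=1}^{M}\varepsilon(t,\theta)\varepsilon'(t,\theta)\Big] \]

In the special case where $\varepsilon(t,\theta)$ depends linearly on $\theta$ the minimization of $J_M(\theta)$
can be carried out analytically. This is the case of linear regression which can be
solved via off-line least-squares or the RLS algorithm. In most cases the
minimization must be performed by using a numerical search routine. In this regard, a
commonly used tool is the Newton-Raphson iteration [Lue69]:

\[ \theta^{(k+1)} = \theta^{(k)} - \big[J_M^{(2)}(\theta^{(k)})\big]^{-1}J_M^{(1)}(\theta^{(k)}) \tag{6.3-35a} \]

where $\theta^{(k)}$ denotes the $k$-th iteration in the search, $J_M^{(1)}(\theta^{(k)})$ the gradient of $J_M(\theta)$
w.r.t. $\theta$ evaluated at $\theta^{(k)}$

\[ J_M^{(1)}(\theta^{(k)}) := \Big[\frac{\partial J_M(\theta)}{\partial\theta}\Big]'\Big|_{\theta=\theta^{(k)}} \]

and $J_M^{(2)}(\theta^{(k)})$ the Hessian matrix of the second derivatives of $J_M(\theta)$ evaluated at $\theta^{(k)}$

\[ J_M^{(2)}(\theta^{(k)}) := \frac{\partial^2 J_M(\theta)}{\partial\theta^2}\Big|_{\theta=\theta^{(k)}} \]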

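Referring to (32b) and, for the sake of simplicity, to the single output case, we find
for $Q = 1$

\[ J_M^{(1)}(\theta) = -\frac{2}{M}\sum_{t=1}^{M}\psi(t,\theta)\varepsilon(t,\theta) \tag{6.3-35b} \]

\[ \psi(t,\theta) := -\Big[\frac{\partial\varepsilon(t,\theta)}{\partial\theta}\Big]' = -\Big[\,\frac{\partial\varepsilon}{\partial\theta_1} \;\cdots\; \frac{\partial\varepsilon}{\partial\theta_{n_\theta}}\,\Big]' \tag{6.3-35c} \]

\[ J_M^{(2)}(\theta) = \frac{2}{M}\sum_{t=1}^{M}\psi(t,\theta)\psi'(t,\theta) + \frac{2}{M}\sum_{t=1}^{M}H(t,\theta)\varepsilon(t,\theta) \tag{6.3-35d} \]

\[ H(t,\theta) := \Big[\frac{\partial^2\varepsilon(t,\theta)}{\partial\theta_i\,\partial\theta_j}\Big]_{i,j = 1,\cdots,n_\theta} \tag{6.3-35e} \]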

Suppose that the real system is exactly described by the adopted model for $\theta = \theta_0$
in the sense that

\[ y(t) = \hat y(t \mid t-1; \theta_0) + e(t) \tag{6.3-36} \]

where $\theta_0$ denotes the true parameter vector. Then, $\varepsilon(t,\theta_0) = e(t)$, with $\{e(t)\}$ as
in (2-69). Note that the entries of $H(t,\theta)$ only depend on $y^{t-1}$, $u^{t-1}$. Hence, under
stationarity and the usual ergodicity conditions [Cai88], the second term on the
R.H.S. of (35d) vanishes for $\theta = \theta_0$. If we extrapolate such a conclusion to any $\theta$,
(35a) becomes

\[ \theta^{(k+1)} = \theta^{(k)} + \Big[\sum_{t=1}^{M}\psi(t,\theta^{(k)})\psi'(t,\theta^{(k)})\Big]^{-1}\Big[\sum_{t=1}^{M}\psi(t,\theta^{(k)})\varepsilon(t,\theta^{(k)})\Big] \tag{6.3-37} \]

These are called the Gauss-Newton iterations.



Problem 6.3-4 (Least Squares as a MPE Method) Consider the linear regression model (33).
Show that (37) yields a system of normal equations that for every $k$ gives an off-line or batch
least squares estimate of $\theta$.
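A quick numerical check of this fact can be sketched as follows for a two-parameter linear regression, for which $\varepsilon(t,\theta) = y(t) - \varphi'(t-1)\theta$ and hence $\psi(t,\theta) = \varphi(t-1)$. The function name `gauss_newton_step` and the data are assumptions made for the example, and the 2x2 system is solved by Cramer's rule to keep the sketch self-contained. Whatever the starting point, a single iteration lands on the same batch least squares estimate.

```python
from typing import List, Sequence


def gauss_newton_step(theta: Sequence[float],
                      phis: Sequence[Sequence[float]],
                      ys: Sequence[float]) -> List[float]:
    """One Gauss-Newton iteration (6.3-37) for a 2-parameter linear regression."""
    R = [[0.0, 0.0], [0.0, 0.0]]   # sum of psi psi'
    g = [0.0, 0.0]                 # sum of psi eps
    for phi, y in zip(phis, ys):
        eps = y - (phi[0] * theta[0] + phi[1] * theta[1])
        for i in range(2):
            g[i] += phi[i] * eps
            for j in range(2):
                R[i][j] += phi[i] * phi[j]
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    step = [(R[1][1] * g[0] - R[0][1] * g[1]) / det,
            (R[0][0] * g[1] - R[1][0] * g[0]) / det]
    return [theta[0] + step[0], theta[1] + step[1]]


# Hypothetical regressors and (noisy) outputs.
phis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [2.1, -0.9, 1.2]
theta_a = gauss_newton_step([0.0, 0.0], phis, ys)
theta_b = gauss_newton_step([5.0, 5.0], phis, ys)
```

For these data both calls return the least squares estimate $[2.1, \; -0.9]'$, up to rounding.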

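Example 6.3-4 (Gauss-Newton iterations for the ARMAX model) Consider again Example 3
for a SISO ARMAX model. First differentiate (34d) w.r.t. $a_i$ to get

\[ C(d)\frac{\partial\varepsilon(t,\theta)}{\partial a_i} = y(t-i), \qquad i = 1, \cdots, n_a \tag{6.3-38a} \]

Next, differentiation of (34d) w.r.t. $b_i$ gives

\[ C(d)\frac{\partial\varepsilon(t,\theta)}{\partial b_i} = -u(t-i), \qquad i = 1, \cdots, n_b \tag{6.3-38b} \]

Similarly, differentiate (34d) w.r.t. $c_i$ to get

\[ \varepsilon(t-i,\theta) + C(d)\frac{\partial\varepsilon(t,\theta)}{\partial c_i} = 0, \qquad i = 1, 2, \cdots, n_c \tag{6.3-38c} \]

If we set

\[ \theta := [\,a_1 \;\cdots\; a_{n_a} \;\; b_1 \;\cdots\; b_{n_b} \;\; c_1 \;\cdots\; c_{n_c}\,]' \tag{6.3-38d} \]

we find for (35c)

\[ \psi(t,\theta) = \frac{1}{C(d)}\,[\,-y(t-1) \;\cdots\; -y(t-n_a) \;\; u(t-1) \;\cdots\; u(t-n_b) \;\; \varepsilon(t-1,\theta) \;\cdots\; \varepsilon(t-n_c,\theta)\,]' \]
\[ \phantom{\psi(t,\theta)} = [\,-y_f(t-1) \;\cdots\; -y_f(t-n_a) \;\; u_f(t-1) \;\cdots\; u_f(t-n_b) \;\; \varepsilon_f(t-1,\theta) \;\cdots\; \varepsilon_f(t-n_c,\theta)\,]' \tag{6.3-38e} \]

with $y_f(t)$ as in (28h). The above expression should be compared with (28g) used in the RML
algorithm (28). It elucidates the operations that must be carried out at each iteration of (37).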
The Gauss-Newton iterations properly yield an off-line or batch estimate of $\theta$.
However, these iterations can be suitably modified by using further simplifications
so as to provide recursive algorithms. For instance, by recursively minimizing at
time $t$ the following exponentially weighted cost

\[ J_t(\theta) = \sum_{k=1}^{t}\lambda^{t-k}\|\varepsilon(k,\theta)\|_Q^2 \tag{6.3-39a} \]

the following Recursive Minimum Prediction Error (RMPE) algorithm can be
obtained [SS89]:

\[ \theta(t+1) = \theta(t) + \lambda P(t+1)\psi(t)Q\varepsilon(t+1) \tag{6.3-39b} \]
\[ P(t+1) = \frac{1}{\lambda}\big\{P(t) - P(t)\psi(t)\,[Q^{-1} + \psi'(t)P(t)\psi(t)]^{-1}\,\psi'(t)P(t)\big\} \tag{6.3-39c} \]
\[ \varepsilon(t) = \varepsilon(t,\theta(t-1)) \tag{6.3-39d} \]
\[ \psi(t-1) = -\Big[\frac{\partial\varepsilon(t,\theta)}{\partial\theta}\Big]'\Big|_{\theta=\theta(t-1)} = -\Big[\,\frac{\partial\varepsilon}{\partial\theta_1} \;\cdots\; \frac{\partial\varepsilon}{\partial\theta_{n_\theta}}\,\Big]'\Big|_{\theta=\theta(t-1)} \tag{6.3-39e} \]

The RMPE algorithm (39) can be applied to different prediction models. It is
simple to see that for a linear regression model it gives the RLS with exponential
forgetting factor $\lambda$, cf. (22). In the light of the results of Example 4, particularly
(38e), it is not surprising that (39) applied to a SISO ARMAX model yields, once
suitable simplifications are made, the RML algorithm (28) with forgetting factor
$\lambda$.

Problem 6.3-5 (RMPE and RML algorithms) Consider the ARMAX model of Example 3 and
the related results of Example 4. Find the simplifications that are needed to make the RMPE
algorithm for $\lambda = 1$ coincident with the RML algorithm (28).



6.3.5 Tracking and Covariance Management 

There are several issues that must be taken into account in the practical use of the 
recursive estimation algorithms introduced in this section. Though we discuss one 
of them with reference to the RLS algorithm, it is common to the other recursive 
algorithms as well. 

An important reason for using recursive estimation in practice is that the system
can be time-varying, and its variations have to be tracked. The adoption of the
Exponentially Weighted RLS (22), related to the minimization of (16a) with

\[ S_t(\theta) = \frac{1}{2}\sum_{k=1}^{t}\lambda^{t-k}\,[y(k) - \varphi'(k-1)\theta]^2 \tag{6.3-40a} \]

where $\lambda \in (0,1)$, seems to be a quite natural choice. In this case $\lambda$ is called the
forgetting factor. Since $\lambda^k = e^{k\ln\lambda} \approx e^{-k(1-\lambda)}$, the measurements that are older
than $1/(1-\lambda)$ are included in the criterion with weights smaller than $e^{-1} \approx 36\%$
of the most recent measurement. Therefore, we can associate to $\lambda$ a data memory
$M$

\[ M = \frac{1}{1-\lambda} \tag{6.3-40b} \]

Roughly, $M$ indicates the number of past measurements on which the current estimate
is effectively based. Typical choices for $\lambda$ are in the range between 0.98 ($M = 50$)
and 0.995 ($M = 200$).
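The correspondence between forgetting factor and data memory in (40b), and the quality of the approximation $\lambda^M \approx e^{-1}$, can be checked with a few lines of Python. The function name `data_memory` is an assumption made for this sketch.

```python
import math


def data_memory(lam: float) -> float:
    """Data memory M = 1/(1 - lambda) associated with a forgetting factor, cf. (6.3-40b)."""
    return 1.0 / (1.0 - lam)


m_fast = data_memory(0.98)                  # about 50 samples
m_slow = data_memory(0.995)                 # about 200 samples
weight_at_memory = 0.98 ** 50               # weight of a measurement M steps old
approximation_gap = abs(weight_at_memory - math.exp(-1.0))
```

For $\lambda = 0.98$ the weight of a measurement 50 steps old is about 0.364, against $e^{-1} \approx 0.368$.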

Problem 6.3-6 Consider the exponentially weighted RLS (22) with $\lambda(t+1) = \lambda$, $\lambda \in (0,1)$.
Suppose that the regressor sequence $\{\varphi(k)\}_{k=0}^{\infty}$ lies in a hyperplane of dimension lower than $n_\theta$.
Show that as $t \to \infty$ $P^{-1}(t)$ becomes singular, and hence $P(t)$ diverges, irrespective of $P^{-1}(0)$.

As the above problem suggests, a potential difficulty with Exponentially Weighted
RLS is the so-called covariance wind-up phenomenon. If the regressor vectors bring
little or no information, viz. according to the comment after (18) $P(k)\varphi(k) \approx 0$, it
follows from (22) that $P(k+1) \approx P(k)/\lambda$. The forgetting has therefore the effect of
increasing the size of $P(k)$ from one recursion to the next. Then, if no information
enters the estimator over a long period, the division by $\lambda$ at every step causes $P(k)$
to become very large, leading to erratic behaviour of the estimates and possibly
numerical overflow.
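The wind-up mechanism is easy to reproduce numerically. The following plain Python sketch iterates the covariance update (22b) with a constant forgetting factor and a regressor that excites one direction only; the function name, the value $\lambda = 0.98$ and the regressor are assumptions made for the example. The entry of $P$ along the excited direction stays small, whereas the entry along the unexcited direction grows as $\lambda^{-k}$.

```python
from typing import List

Matrix = List[List[float]]


def ewrls_covariance_update(P: Matrix, phi: List[float], lam: float) -> Matrix:
    """Covariance update (6.3-22b) with a constant forgetting factor lam."""
    n = len(phi)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * Pphi[i] for i in range(n))
    return [[(P[i][j] - Pphi[i] * Pphi[j] / denom) / lam for j in range(n)]
            for i in range(n)]


lam = 0.98
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    # Hypothetical poorly exciting data: only the first direction is ever excited.
    P = ewrls_covariance_update(P, [1.0, 0.0], lam)
```

After 200 steps the unexcited diagonal entry equals $0.98^{-200} \approx 57$ and keeps growing geometrically, while the excited one settles near $1/\lambda - 1 \approx 0.02$.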

According to the above, Exponentially Weighted RLS must be used carefully. The
main idea is to ensure that $P(k)$ stays bounded. In particular, whenever possible,
a dither signal should be added to the system input so as to prevent the algorithm
from incurring the covariance wind-up phenomenon. Another possibility is to
equip RLS with a covariance resetting logic fix according to which $P(k)$ is reset
to a given positive definite matrix, whenever its value computed via (22b) gets
too small. A useful procedure, viz. the dead-zone fix, is to stop the updating of
the parameter vector and the covariance matrix when $P(k)\varphi(k)$ and/or $\varepsilon(k)$ are
sufficiently small. We next focus on specific mechanisms for preventing covariance
wind-up, such as directional forgetting and constant trace RLS.

Directional Forgetting RLS In Exponentially Weighted RLS the covariance
wind-up phenomenon is caused by the fact that at each updating step the
normalized information matrix $P^{-1}(t)$ is reduced by the multiplicative factor $\lambda$ in all
directions in $\mathbb{R}^{n_\theta}$, except along the direction of the incoming regressor $\varphi(t)$ where
the matrix $\lambda\varphi(t)\varphi'(t)$ is added to $\lambda P^{-1}(t)$. In directional forgetting [Hag83], [KK84],
[Kul87], the idea is to modify $P^{-1}(t)$ only along the direction of the incoming
regressor according to the formula

\[ P^{-1}(t+1) = P^{-1}(t) + \eta(t)\varphi(t)\varphi'(t) \tag{6.3-41a} \]

This should be compared with (22c). In (41a) $\eta(t)$ is a real number to be suitably
chosen under the constraint that $P^{-1}(t+1) > 0$ provided that $P^{-1}(t) > 0$.

Problem 6.3-7 Let $P^{-1} = P^{-T} > 0$ and $\bar P^{-1} = P^{-1} + \eta\varphi\varphi'$, with $\eta \in \mathbb{R}$. Show that $\bar P^{-1} > 0$
if and only if

\[ -\frac{1}{\varphi' P\varphi} < \eta \tag{6.3-41b} \]

[Hint: Prove that (41b) implies $\bar P > 0$. Next, show that if (41b) is not true, the vector $x = P\varphi$
makes $x'\bar P^{-1}x \le 0$.]

In [KK84] the following choice for $\eta(t)$ is derived via a Bayesian argument

\[ \eta(t) = \lambda - \frac{1-\lambda}{\varphi'(t)P(t)\varphi(t)} \tag{6.3-41c} \]

Here $\lambda$ plays a role similar to that of a fixed forgetting factor. Note that $\eta(t)$ as
defined above satisfies the inequality (41b). The RLS with directional forgetting
(41c) updates the $\theta$-estimate as in (13a) with

\[ P(t+1) = P(t) - \frac{P(t)\varphi(t)\varphi'(t)P(t)}{\eta^{-1}(t) + \varphi'(t)P(t)\varphi(t)} \tag{6.3-41d} \]

the latter being obtained from (41a) via the Matrix Inversion Lemma (5.3-16).
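The covariance update with directional forgetting can be sketched as follows in plain Python. The function name, the tolerance `tol` used to skip the update when $\varphi'(t)P(t)\varphi(t)$ is numerically zero, and the data are assumptions made for the example and are not part of the text. The code uses the factor $\eta/(1 + \eta\,\varphi'P\varphi)$, which equals $1/(\eta^{-1} + \varphi'P\varphi)$ in (41d) whenever $\eta \neq 0$ and avoids dividing by $\eta$; with the choice (41c) the denominator $1 + \eta\,\varphi'P\varphi = \lambda(1 + \varphi'P\varphi)$ is always positive.

```python
from typing import List

Matrix = List[List[float]]


def directional_forgetting_update(P: Matrix, phi: List[float], lam: float,
                                  tol: float = 1e-12) -> Matrix:
    """Covariance update (6.3-41d) with eta(t) chosen as in (6.3-41c)."""
    n = len(phi)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    s = sum(phi[i] * Pphi[i] for i in range(n))          # phi' P phi
    if s <= tol:
        return [row[:] for row in P]                     # no information: leave P unchanged
    eta = lam - (1.0 - lam) / s                          # (41c)
    factor = eta / (1.0 + eta * s)                       # = 1/(1/eta + s) for eta != 0
    return [[P[i][j] - factor * Pphi[i] * Pphi[j] for j in range(n)]
            for i in range(n)]


# Scalar check against P^{-1}(t+1) = P^{-1}(t) + eta * phi^2, cf. (41a).
P_scalar = directional_forgetting_update([[2.0]], [1.0], 0.9)

# A zero regressor leaves P untouched.
P_idle = directional_forgetting_update([[2.0, 0.0], [0.0, 3.0]], [0.0, 0.0], 0.9)

# Only the first direction excited: the second diagonal entry does not grow.
P_dir = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    P_dir = directional_forgetting_update(P_dir, [1.0, 0.0], 0.98)
```

In the last experiment the unexcited entry of $P$ keeps its initial value, in contrast with the geometric growth produced by (22b) under the same excitation.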

Constant Trace RLS A constant covariance trace algorithm can be built out
of the RLS with Covariance Modification (21) by simply choosing $Q(t+1)$ so as to
make $\mathrm{Tr}\,P(t+1) = \mathrm{Tr}\,P(t) = \mathrm{Tr}\,P(0)$, viz.

\[ \mathrm{Tr}\,Q(t+1) = \frac{\varphi'(t)P^2(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)} \tag{6.3-42a} \]

One possible choice for $Q(t+1)$ is then to set

\[ Q(t+1) = \frac{\varphi'(t)P^2(t)\varphi(t)}{n_\theta\,[1 + \varphi'(t)P(t)\varphi(t)]}\,I_{n_\theta} \tag{6.3-42b} \]

An alternative to the above is to start with the Exponentially Weighted RLS
and choose the time-varying forgetting factor $\lambda(t+1)$ so as to make the covariance
trace constant, viz.

\[ \lambda(t+1) = 1 - \frac{1}{\mathrm{Tr}\,P(0)}\,\frac{\varphi'(t)P^2(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)} \tag{6.3-43} \]
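That the forgetting factor (43) keeps the trace constant can be verified numerically with the following plain Python sketch. The function name `constant_trace_update` and the numerical values are assumptions made for the example; the sketch presumes that $\mathrm{Tr}\,P(t)$ already equals the prescribed value `trace0`.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def constant_trace_update(P: Matrix, phi: List[float],
                          trace0: float) -> Tuple[Matrix, float]:
    """Update (6.3-22b) with lambda(t+1) chosen as in (6.3-43)."""
    n = len(phi)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    s = sum(phi[i] * Pphi[i] for i in range(n))          # phi' P phi
    num = sum(v * v for v in Pphi)                       # phi' P^2 phi (P symmetric)
    lam = 1.0 - num / ((1.0 + s) * trace0)               # (43)
    P_new = [[(P[i][j] - Pphi[i] * Pphi[j] / (1.0 + s)) / lam
              for j in range(n)] for i in range(n)]
    return P_new, lam


P0 = [[2.0, 0.5], [0.5, 1.0]]                            # hypothetical covariance, trace 3
P1, lam1 = constant_trace_update(P0, [1.0, -1.0], trace0=3.0)
trace1 = P1[0][0] + P1[1][1]
```

With these numbers $\varphi'P\varphi = 2$, $\varphi'P^2\varphi = 2.5$, so that $\lambda(t+1) = 1 - 2.5/9 \approx 0.722$ and the trace of the updated matrix is again 3.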

It is to be pointed out that, though we have described Directional Forgetting RLS 
and Constant Trace RLS as algorithms for coping with the covariance wind-up 
phenomenon, they turn out to be also suitable for estimating time varying param- 
eters. A similar remark can be applied to other estimation methods for covariance 
management such as the ones based on covariance resetting [GS84] and the variable 
forgetting factor of [FKY81]. 



6.3.6 Numerically Robust Recursions 

The recursive identification algorithms, as they have been given so far, are known
not to be numerically robust. In particular, the RLS algorithm (13) hinges on
(13c)-(13d) which is seen to be a Riccati equation. By Problem 2-3, this is the
dual of the Riccati equation relevant for the LQOR problem. Then, from our
discussion in Sect. 2.5, it follows that (13e) is the numerically robustified form
of the Riccati recursions for RLS. The use of (13e) in the RLS algorithm yields
in most circumstances completely satisfactory results. Nonetheless, more robust
numerical implementations are obtained by factorizing the "covariance matrix"
$P(t)$ in terms of a "square root" matrix $S(t)$, viz. $P(t) = S(t)S'(t)$. The RLS
algorithm can be implemented by updating $S(t)$ in each recursion. This is roughly
equivalent to computing $P(t)$ in double precision and ensures that $P(t)$ remains
positive definite. If the rounding errors are significant, implementations based on
factorization methods yield results definitely superior to the ones achievable with
the standard RLS algorithm (13).

We next describe RLS recursions based on the U-D factorized form

$$P(t) = U(t)D(t)U'(t)$$

where $U(t)$ is an $n_\theta \times n_\theta$ upper triangular matrix with unit diagonal elements and $D(t)$ a diagonal matrix of dimension $n_\theta$. The recursions are obtained by slightly modifying the U-D covariance factorization in [Bie77] so as to consider the exponentially weighted RLS (22).

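U-D Recursions for Exponentially Weighted RLS Let

$$P(t-1) = UDU' \quad\text{and}\quad \theta(t-1) = \theta \tag{6.3-44}$$

$$U = [\,u_1 \;\cdots\; u_{n_\theta}\,], \qquad D = \operatorname{diag}\{\delta_i,\ i = 1,\dots,n_\theta\}$$

Denote by

$$y = y(t) \quad\text{and}\quad \varphi = \varphi(t-1) \tag{6.3-45}$$

the regressand and, respectively, the regressor at time $t$. Then

$$P(t) = \bar U \bar D \bar U' \quad\text{and}\quad \theta(t) = \theta + K(y - \varphi'\theta) \tag{6.3-46}$$

$$\bar U = [\,\bar u_1 \;\cdots\; \bar u_{n_\theta}\,], \qquad \bar D = \operatorname{diag}\{\bar\delta_i,\ i = 1,\dots,n_\theta\}$$

are generated as follows.

• Step 1 Compute the vectors $f$ and $v$

$$f = [\,f_1 \;\cdots\; f_{n_\theta}\,]' = U'\varphi, \qquad v = [\,v_1 \;\cdots\; v_{n_\theta}\,]' = Df \tag{6.3-47}$$

• Step 2 Set

$$\bar\delta_1 = \frac{\delta_1}{\alpha_1\lambda}, \qquad \alpha_1 = 1 + v_1 f_1, \qquad K_2 = [\,v_1 \;\; 0_{1\times(n_\theta-1)}\,]' \tag{6.3-48}$$

• Step 3 For $i = 2, \dots, n_\theta$ recursively cycle through (49)-(53):

$$\alpha_i = \alpha_{i-1} + v_i f_i \tag{6.3-49}$$

$$\bar\delta_i = \frac{\delta_i\,\alpha_{i-1}}{\alpha_i\,\lambda} \tag{6.3-50}$$

$$\mu_i = -\frac{f_i}{\alpha_{i-1}} \tag{6.3-51}$$

$$\bar u_i = u_i + \mu_i K_i \tag{6.3-52}$$

$$K_{i+1} = K_i + v_i u_i \tag{6.3-53}$$

• Step 4 Compute

$$K = \frac{K_{n_\theta+1}}{\alpha_{n_\theta}} \tag{6.3-54}$$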
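Steps 1-4 can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the reference implementation of [Bie77]: it assumes that the exponentially weighted covariance update reads $P(t) = \lambda^{-1}[P - P\varphi\varphi'P/(1+\varphi'P\varphi)]$, so that the forgetting factor $\lambda$ enters only through the division of the diagonal entries $\bar\delta_i$. The function name `ud_rls_update` and the test data are hypothetical.

```python
import numpy as np

def ud_rls_update(U, d, theta, phi, y, lam=1.0):
    """One U-D update; U is unit upper triangular, d holds the diagonal of D.

    Assumption: P(t) = [P - P phi phi' P / (1 + phi' P phi)] / lam.
    Returns the updated factors and parameter estimate.
    """
    n = len(d)
    f = U.T @ phi                       # Step 1, (47)
    v = d * f
    U_new = U.copy()
    d_new = np.empty(n)
    alpha = 1.0 + v[0] * f[0]           # Step 2, (48)
    d_new[0] = d[0] / (alpha * lam)
    K = np.zeros(n)
    K[0] = v[0]
    for i in range(1, n):               # Step 3, (49)-(53)
        alpha_prev = alpha
        alpha = alpha_prev + v[i] * f[i]
        d_new[i] = d[i] * alpha_prev / (alpha * lam)
        mu = -f[i] / alpha_prev
        U_new[:, i] = U[:, i] + mu * K
        K = K + v[i] * U[:, i]          # uses the old column u_i
    gain = K / alpha                    # Step 4, (54)
    theta_new = theta + gain * (y - phi @ theta)
    return U_new, d_new, theta_new

# Compare with the direct (unfactorized) update on random data.
rng = np.random.default_rng(0)
n, lam = 4, 0.98
U = np.eye(n) + np.triu(rng.standard_normal((n, n)), k=1)
d = rng.uniform(0.5, 2.0, size=n)
theta = rng.standard_normal(n)
phi = rng.standard_normal(n)
y = 1.3
U_new, d_new, theta_new = ud_rls_update(U, d, theta, phi, y, lam)

P = U @ np.diag(d) @ U.T
Pphi = P @ phi
P_direct = (P - np.outer(Pphi, Pphi) / (1.0 + phi @ Pphi)) / lam
theta_direct = theta + Pphi / (1.0 + phi @ Pphi) * (y - phi @ theta)
```

The factorized update never forms $P(t)$ explicitly; the comparison with `P_direct` is included only to confirm that both routes give the same covariance and the same estimate.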
Main points of the section Several parameter estimation algorithms have been introduced for recursively identifying dynamic linear I/O models, such as FIR, ARX or ARMAX models. These algorithms can be classified as either linear or pseudo-linear regression algorithms, according to the independence or dependence of the regressor on the estimated vector. The majority of the algorithms considered admit a Minimum Prediction Error formulation and hence can be seen as tools to fit the observed data by models of limited and reduced complexity. The need to discount old data suggests adopting suitable provisions aimed at ensuring boundedness of the $P(t)$ matrix. These include fixes like: dither signals; covariance resetting; deadzones; directional forgetting; constant trace; or combinations thereof. In order to enhance numerical robustness, the recursions have to be carried out via factorization methods, e.g. the U-D estimate-covariance updating algorithm.

6.4 Convergence of Recursive Identification Algorithms

In this section we give an account of the convergence properties of the recursive esti- 
mation algorithms introduced in Sect. 3. The discussion is carried out in detail only 
for the RLS algorithm, the results for the other algorithms being briefly sketched. 
The main idea is to describe some convergence analysis tools applicable to a wide class of recursive stochastic algorithms, in connection with the most frequently used identification method in adaptive control applications, viz. the RLS algorithm. We consider the RLS algorithm first in a deterministic and next in a stochastic setting. Finally, we state convergence results for some pseudo-linear regression algorithms.

We point out from the outset that in order to prove convergence to the "true" system parameter vector $\theta$ some strong assumptions have to be made. In particular:

i. the system model (e.g. FIR, ARX, ARMAX) and its order must be exactly 
known; 

ii. the inputs must be persistently exciting in a sense to be clarified; 

iii. mean-square boundedness is required in the RELS(PO) convergence proof. 

We point out that such properties cannot be guaranteed a priori in adaptive control schemes, where the analysis must therefore be carried out without relying on the convergence of the identifier.

There have been three major approaches to the analysis of recursive identifica- 
tion algorithms: 

(1) Ordinary Differential Equation (ODE) Analysis This method consists of associating a system of ordinary differential equations with a recursive algorithm in such a way that the asymptotic behaviour of the latter is described by the state evolution of the former. We do not introduce the ODE method here and postpone its description and use to subsequent chapters dealing with adaptive control.

(2) Analysis via Stochastic Lyapunov functions The analysis is carried out by the direct construction of a positive supermartingale so as to exploit appropriate martingale convergence theorems. A positive supermartingale (or a stochastic function closely related to it) can be seen as the stochastic analogue of a Lyapunov function of deterministic stability theory. The Lyapunov function methods are the ones mainly used throughout this section for both deterministic and stochastic convergence analysis of RLS. The reader can thus usefully compare the two developments to find out similarities and differences in the two cases.

(3) Direct Analysis In some cases the method of analysis does not follow either of the two approaches above and is specifically tailored to the algorithm under consideration. See, for instance, the convergence proof of RLS originally obtained by [LW82] and presented in [Cai88].



6.4.1 RLS Deterministic Convergence 

Throughout this subsection we assume that (3-1) and (3-2) hold true. This means that there are no modelling errors, the I/O data are noise-free, and there is a true system parameter vector $\theta \in \mathbb{R}^{n_\theta}$ to be determined. Setting



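$$R(t) := \frac{1}{t}\sum_{k=0}^{t-1}\varphi(k)\varphi'(k) \tag{6.4-1a}$$

the system of normal equations (3-14) yields for the RLS estimate

$$\begin{aligned}
\theta(t) &= \Big[R(t) + \tfrac{1}{t}P^{-1}(0)\Big]^{-1}\Big[\tfrac{1}{t}\sum_{k=0}^{t-1}\varphi(k)y(k+1) + \tfrac{1}{t}P^{-1}(0)\theta(0)\Big] \\
&= \Big[R(t) + \tfrac{1}{t}P^{-1}(0)\Big]^{-1}\Big[R(t)\theta + \tfrac{1}{t}P^{-1}(0)\theta(0)\Big] \qquad [(3\text{-}2)]
\end{aligned} \tag{6.4-1b}$$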



Then we see that if $R(t)$ converges to a bounded nonsingular matrix as $t \to \infty$, $\theta(t)$ converges to the true system parameter vector $\theta$. While nonsingularity of $R(t)$ for large $t$ is unavoidable for establishing convergence and, as we shall see soon, is related to the notion of a sufficiently "exciting" regressor, boundedness of $R(t)$ as $t \to \infty$ entails stability of the system to be identified. We consider next a different tool for RLS analysis which does not require system stability.
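Setting

$$\tilde\theta(t) := \theta(t) - \theta \tag{6.4-2a}$$

we can write

$$\varepsilon(t) := y(t) - \varphi'(t-1)\theta(t-1) = -\varphi'(t-1)\tilde\theta(t-1) \tag{6.4-2b}$$

Subtracting $\theta$ from both sides of (3-13a,b), we get

$$\begin{aligned}
\tilde\theta(t+1) &= \tilde\theta(t) + \frac{P(t)\varphi(t)}{1 + \varphi'(t)P(t)\varphi(t)}\,\varepsilon(t+1) &\text{(6.4-2c)}\\
&= \tilde\theta(t) - P(t+1)\varphi(t)\varphi'(t)\tilde\theta(t) &\text{(6.4-2d)}\\
&= \big[I_{n_\theta} - P(t+1)\varphi(t)\varphi'(t)\big]\tilde\theta(t) &\text{(6.4-2e)}\\
&= P(t+1)P^{-1}(t)\tilde\theta(t) \qquad [(3\text{-}13g)] &\text{(6.4-2f)}
\end{aligned}$$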
Since the RLS estimation error $\tilde\theta(t)$ satisfies the difference equation (2), $\tilde\theta(t)$ converges to $0_{n_\theta}$ for any $\theta(0)$ and $P(0)$ provided that (2) and (3-13g) form an asymptotically stable system. In order to find out conditions under which this is guaranteed, we use a Lyapunov function argument [SL91]. We first exhibit a Lyapunov function $V(\tilde\theta(t))$ for (2) and (3-13g), viz. a nonnegative function of $\tilde\theta(t)$ which is nonincreasing along the trajectories of (2) and (3-13g). Next, we find sufficient conditions under which convergence of $V(\tilde\theta(t))$ implies convergence of $\tilde\theta(t)$ to $0_{n_\theta}$.

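Theorem 6.4-1 (RLS convergence). Let the $y$ and $\varphi$ sequences be as in (3-1) and (3-2). Then the nonnegative function

$$V(t) := \tilde\theta'(t)P^{-1}(t)\tilde\theta(t) \tag{6.4-3}$$

is nonincreasing along the trajectories of (2) and (3-13g). Further, provided that

$$\lim_{t\to\infty}\lambda_{\min}\big[P^{-1}(t)\big] = \infty \tag{6.4-4}$$

the RLS estimate $\theta(t)$ converges to $\theta$ as $t \to \infty$, for all $\theta(0)$ and $P(0) = P'(0) > 0$.

Proof By (2f), (3) can be rewritten as follows

$$V(t) = \tilde\theta'(t)P^{-1}(t-1)\tilde\theta(t-1)$$

Consequently,

$$\begin{aligned}
V(t) - V(t-1) &= \big[\tilde\theta(t) - \tilde\theta(t-1)\big]'P^{-1}(t-1)\tilde\theta(t-1) \\
&= -\frac{\tilde\theta'(t-1)\varphi(t-1)\varphi'(t-1)\tilde\theta(t-1)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)} \qquad [(2c)] \\
&= -\frac{\varepsilon^2(t)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)} \qquad [(2b)]
\end{aligned} \tag{6.4-5}$$

Hence, $V(t)$ is nonincreasing. Being also nonnegative, $V(t)$ converges to a bounded limit as $t \to \infty$. Therefore,

$$M \ge \lim_{t\to\infty}\tilde\theta'(t)P^{-1}(t)\tilde\theta(t) \ge \lim_{t\to\infty}\Big\{\lambda_{\min}\big[P^{-1}(t)\big]\cdot\|\tilde\theta(t)\|^2\Big\}$$

for some $M > 0$. Hence, if (4) is fulfilled, $\tilde\theta(t) \to 0_{n_\theta}$ for any $\theta(0)$ and $P(0) = P'(0) > 0$.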
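The monotonicity of $V(t)$ is easy to observe numerically. The sketch below is only an illustration under assumed data: a noise-free linear regression with a hypothetical true parameter vector `theta_true` and random regressors, on which the RLS recursions $\theta(t+1) = \theta(t) + P(t)\varphi(t)\varepsilon(t+1)/(1+\varphi'(t)P(t)\varphi(t))$ and $P(t+1) = P(t) - P(t)\varphi(t)\varphi'(t)P(t)/(1+\varphi'(t)P(t)\varphi(t))$ are run while $V(t)$ is recorded.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
theta_true = np.array([0.5, -1.0, 2.0])   # hypothetical "true" parameters
theta = np.zeros(n)
P = 100.0 * np.eye(n)

def lyapunov(theta, P):
    """V = (theta - theta_true)' P^{-1} (theta - theta_true), cf. (6.4-3)."""
    err = theta - theta_true
    return err @ np.linalg.solve(P, err)

V = [lyapunov(theta, P)]
for _ in range(300):
    phi = rng.standard_normal(n)           # exciting regressor
    y = phi @ theta_true                   # noise-free output
    eps = y - phi @ theta                  # a priori prediction error
    Pphi = P @ phi
    denom = 1.0 + phi @ Pphi
    theta = theta + Pphi * eps / denom
    P = P - np.outer(Pphi, Pphi) / denom
    V.append(lyapunov(theta, P))

final_error = np.linalg.norm(theta - theta_true)
```

With these random regressors $\lambda_{\min}[P^{-1}(t)]$ grows without bound, so the recorded sequence `V` is nonincreasing and `final_error` becomes small, in agreement with the theorem.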

The condition (4) is guaranteed provided that

$$\lim_{t\to\infty}\lambda_{\min}\Big[\sum_{k=0}^{t-1}\varphi(k)\varphi'(k)\Big] = \infty \tag{6.4-6}$$

In order to relate (6) to the system input sequence, we begin by considering a FIR system whereby

$$y(t) = B(d)u(t) = \varphi'(t-1)\theta \tag{6.4-7a}$$

$$\varphi'(t-1) = [\,u(t-1) \;\cdots\; u(t-n_b)\,] \tag{6.4-7b}$$

$$\theta' = [\,b_1 \;\cdots\; b_{n_b}\,] \tag{6.4-7c}$$
We say that the input signal $\{u(t)\}$ is persistently exciting of order $n$ if

$$\rho_1 I_n \ge \lim_{N\to\infty}\frac{1}{N}\sum_{t=1}^{N}\begin{bmatrix} u(t-1) \\ \vdots \\ u(t-n)\end{bmatrix}[\,u(t-1) \;\cdots\; u(t-n)\,] \ge \rho_2 I_n \tag{6.4-8}$$

for some $\rho_1 \ge \rho_2 > 0$. For $n \ge n_b$ this condition implies (6) and, hence, RLS convergence according to Theorem 1.

Problem 6.4-1 (RLS rate of convergence) Prove that for the deterministic FIR system (7) $\|\tilde\theta(t)\|^2$ converges at least at the rate $1/t$ provided that the system input is persistently exciting of order $n_b$. However, as can be verified using (2) via a scalar example where $\varphi(t-1) = 1$, $\tilde\theta(t)$ converges at the rate $1/t$.

It can be shown [GS84] that a stationary input sequence whose spectrum is nonzero at $n$ points or more is persistently exciting of order $n$. In particular, this happens to be true for an input of the form $u(t) = \sum_{i=1}^{s} v_i \sin(\omega_i t + \alpha_i)$, $\omega_i \in (0, \pi)$, $\omega_i \ne \omega_j$, $v_i \ne 0$, and $s \ge n/2$.
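The relation between the number of sinusoids and the order of persistent excitation can be illustrated with a short numerical experiment. The sketch below is an assumption-laden illustration, not part of the original text: it replaces the limit in (8) with a finite sample average, and the amplitudes, frequencies and phases are arbitrary choices. With $s = 2$ sinusoids the spectrum has four lines, so the sample matrix should be positive definite for $n = 4$ and singular for $n = 5$.

```python
import numpy as np

def sample_covariance(u, n):
    """Finite-sample version of the matrix in (6.4-8):
    (1/N) * sum_t phi(t-1) phi(t-1)', phi(t-1) = [u(t-1) ... u(t-n)]'."""
    N = len(u) - n
    rows = np.array([u[t - 1 - np.arange(n)] for t in range(n, n + N)])
    return rows.T @ rows / N

t = np.arange(20000)
# s = 2 sinusoids with distinct frequencies in (0, pi) (illustrative values)
u = 1.0 * np.sin(0.5 * t + 0.3) + 0.7 * np.sin(1.7 * t + 1.1)

min_eig_order4 = np.linalg.eigvalsh(sample_covariance(u, 4)).min()
min_eig_order5 = np.linalg.eigvalsh(sample_covariance(u, 5)).min()
```

The smallest eigenvalue is bounded away from zero for $n = 4 = 2s$ and vanishes (up to rounding) for $n = 5$, since any five consecutive samples of a sum of two sinusoids satisfy a fourth-order linear recurrence.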

For the general deterministic recurrent system (1), RLS convergence properties are similar to the ones valid for the FIR system (7). In particular, the following result applies.

Fact 6.4-1. [GS84]. Let the $y(\cdot)$ and $\varphi(\cdot)$ sequences be as in (3-1) and (3-2). Then, if $A(d)$ is strictly Hurwitz, the RLS estimate $\theta(t)$ converges to the true parameter vector provided that:

• the input $\{u(t)\}$ is a stationary sequence whose spectral distribution is nonzero at $n_a + n_b$ points or more;

• the polynomials $A(d)$ and $B(d)$ are coprime.

A similar convergence result [GS84] is available if the system (3-1) is unstable and its input $u(t)$ is given by a not necessarily stabilizing but piecewise constant feedback component $F\varphi(t-1)$ plus an exogenous signal $v(t) = \sum_{i=1}^{s} v_i \sin(\omega_i t + \alpha_i)$, $\omega_i \in (0, \pi)$, $\omega_i \ne \omega_j$, $v_i \ne 0$, $s \ge 4n$. This convergence result is subject again to the condition that $A(d)$ and $B(d)$ are coprime polynomials. We point out the importance of the latter condition. In fact, it is basically an identifiability condition in that it makes the representation (3-1) well-defined on the grounds of the I/O system behaviour. Further, in the light of Problem 2.4-5, the above condition is equivalent to the reachability of the state $\varphi(t)$ of the system (3-1) (Cf. Lemma 5.4-1). It is intuitively clear that reachability of $\varphi(t)$ is a key property that has to be satisfied in order that (6) be possibly achieved via a persistently exciting input signal.

A deterministic convergence analysis for the Exponentially Weighted RLS is reported in [JJBA82], where it is shown that, under persistent excitation, this algorithm is exponentially convergent for $\lambda \in (0, 1)$, unlike the $\lambda = 1$ case (see also [Joh88]). Exponential convergence is important in that it implies tracking capability for slowly varying parameters [AJ83]. However, as we have seen at the end of Sect. 3, other problems arise with the Exponentially Weighted RLS algorithm when $\lambda < 1$ whenever persistent excitation conditions are not satisfied.

For a deterministic convergence analysis of Directional Forgetting RLS see 
[BBC90a] . A constant trace normalized version of RLS is analysed under determin- 
istic conditions in [LG85]. This analysis is reported in Sect. 8.6 where the algorithm 
is used in adaptive control schemes. For conditions that guarantee convergence of 
the Projection algorithm in the deterministic case the reader is referred to [GS84] . 
6.4.2 RLS Stochastic Convergence

We first consider the RLS algorithm under the restrictive assumption that $y$ and $u$ are finite variance (square integrable) strictly stationary ergodic processes. Hence [Cai88]

$$\lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N}\varphi(k-1)\varphi'(k-1) = \mathcal{E}\{\varphi(t)\varphi'(t)\} =: \Psi_\varphi \quad \text{a.s.} \tag{6.4-9a}$$

where

$$\varphi(t-1) := [\,y'(t-1) \;\cdots\; y'(t-n_a) \;\; u'(t-1) \;\cdots\; u'(t-n_b)\,]' \tag{6.4-9b}$$

Further, let

$$\Psi_\varphi > 0 \tag{6.4-9c}$$

We make no assumption on how $y$ and $u$ are related. In particular, the underlying system whose input and output variables are the $u$ and, respectively, the $y$ process need not be linear or exactly described by an ARX model with $n_a = \partial A(d)$ and $n_b = \partial B(d)$.

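Consider next the orthogonal projection $\mathcal{E}\{y(t) \mid \varphi(t-1)\}$ of $y(t)$ onto $[\varphi(t-1)]$, the subspace of $L_2(\Omega, \mathcal{F}, P)$ (Cf. Example 1-2) spanned by the random vector $\varphi(t-1)$. It results (Cf. Problem 1-2)

$$\mathcal{E}\{y(t) \mid \varphi(t-1)\} = \mathcal{E}\{y(t)\varphi'(t-1)\}\,\Psi_\varphi^{-1}\,\varphi(t-1) = \overset{\circ}{\theta}{}'\varphi(t-1) \tag{6.4-10a}$$

where

$$\overset{\circ}{\theta} := \Psi_\varphi^{-1}\,\mathcal{E}\{\varphi(t-1)y'(t)\} \in \mathbb{R}^{n_\theta} \tag{6.4-10b}$$

Note that

$$\mathcal{E}\Big\{\big[y(t) - \overset{\circ}{\theta}{}'\varphi(t-1)\big]\varphi'(t-1)\Big\} = 0 \tag{6.4-10c}$$

The following theorem relates the RLS estimate to the above vector $\overset{\circ}{\theta}$.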



Theorem 6.4-2 (RLS Consistency in the Ergodic Case). Let $y$ and $u$ be finite variance strictly stationary ergodic processes, and, consequently, (9a) be fulfilled. In addition, let (9c) hold. Then,

i. The vector $\overset{\circ}{\theta}$ given by (10b) is the unique vector that parameterizes the orthogonal projection of $y(t)$ onto $[\varphi(t-1)]$ according to (10a) or (10c).

ii. For each $t \in \mathbb{Z}_1$ there is a unique solution $\theta(t)$, the RLS estimate of $\overset{\circ}{\theta}$, to the normal equations (3-14).

iii. The RLS estimator is strongly consistent, viz., $\theta(t)$ converges to $\overset{\circ}{\theta}$ a.s. as $t \to \infty$.
Proof For i. and ii. see Sect. 1 and, respectively, Proposition 3-1. Setting

$$R(t) := \frac{1}{t}\sum_{k=0}^{t-1}\varphi(k)\varphi'(k)$$

from (3-14) we get by ergodicity

$$\lim_{t\to\infty}\theta(t) = \Big[\lim_{t\to\infty}R(t)\Big]^{-1}\lim_{t\to\infty}\frac{1}{t}\sum_{k=0}^{t-1}\varphi(k)y'(k+1) = \Psi_\varphi^{-1}\,\mathcal{E}\{\varphi(k)y'(k+1)\} = \overset{\circ}{\theta} \tag{6.4-11}$$



The relevance of Theorem 2 is that it tells us that, under ergodicity, the RLS-based one-step output predictor $\hat y(t \mid t-1; \theta(t)) := \theta'(t)\varphi(t-1)$ converges a.s. to $\hat y(t \mid t-1; \overset{\circ}{\theta}) = \overset{\circ}{\theta}{}'\varphi(t-1)$, the MMSE estimator of $y(t)$ based on $y^{t-1}$, $u^{t-1}$, among all estimators of the form $\theta'\varphi(t-1)$. This result consolidates a similar observation made after (3-31b).

In the last line of the above proof we have used the ergodicity property (9a) and (9c). Comparing this with (8), we see that (9c) can be interpreted as a persistency of excitation condition for the present ergodic situation. When (9c) is satisfied, $\varphi$ is said to be a persistently exciting regressor. Under these circumstances, if $\varphi(k)$ is as in (7b), $u$ is said to be persistently exciting of order $n_b$.

Suppose now that the data generating system is as in (3-1) and (3-2) with $A(d)$ strictly Hurwitz and $u$ ergodic. Then, $\varphi(k)$ as in (3-2b) is a persistently exciting regressor vector if $A(d)$ and $B(d)$ are coprime and $u$ is a persistently exciting input of order $n_a + n_b$ [SS89]. Let the data generating system be given by a perturbed version of the difference equation (3-1):

$$A(d)y(t) = B(d)u(t) + v(t) \tag{6.4-12}$$

In (12) $v(t)$ represents the "disturbance" or the "equation error". It is assumed that $u$ and $v$ are ergodic, $\mathcal{E}\{u(t)v(\tau)\} = 0$ for all $t$ and $\tau$, and $A(d)$ is strictly Hurwitz. Then, $\varphi(k-1)$ as in (3-2b) is a persistently exciting regressor, provided that $u$ is persistently exciting of order $n_b$, and $v$ is persistently exciting of order $n_a$ [SS89]. Note that the latter condition is always fulfilled if $v(t) = H(d)e(t)$ with $H(d)$ a rational transfer function and $e(t)$ white.
Rewrite (12) as follows

$$y(t) = \theta'\varphi(t-1) + v(t) \tag{6.4-13a}$$

Consequently, (10b) becomes

$$\overset{\circ}{\theta} = \theta + \Psi_\varphi^{-1}\,\mathcal{E}\{\varphi(t-1)v'(t)\} \tag{6.4-13b}$$

Hence, under the stated assumptions, the RLS estimator converges a.s. to the "true" parameter vector $\theta$, in which case we say that the RLS estimator is asymptotically unbiased, if and only if the equation error $v(t)$ is uncorrelated with the regressor $\varphi(t-1)$

$$\mathcal{E}\{\varphi(t-1)v'(t)\} = 0 \tag{6.4-13c}$$
Problem 6.4-2 Assume that the data generating system is given by the ARMAX model

$$A(d)y(t) = B(d)u(t) + C(d)e(t)$$

with $\partial A(d) = n_a$, $\partial B(d) = n_b$, and $\partial C(d) \ge 1$. Consider the RLS estimator with regressor $\varphi(k-1) = [\,-y(k-1) \;\cdots\; -y(k-n_a) \;\; u(k-1) \;\cdots\; u(k-n_b)\,]'$. Assume that $A(d)$ is strictly Hurwitz and $u$ and $e$ are finite variance ergodic processes with $u$ persistently exciting of order $n_b$. Prove that such an estimator of $\theta = [\,a_1 \;\cdots\; a_{n_a} \;\; b_1 \;\cdots\; b_{n_b}\,]'$ is asymptotically unbiased if $\mathcal{E}\{u(t)e(\tau)\} = 0$ for all $t$ and $\tau$, provided that $n_a = 0$. Conversely, show that (13c), and hence the above property, does not hold true if $n_a > 0$ and/or $\mathcal{E}\{u(t)e(\tau)\} = 0$ only for all $t < \tau$.

Problem 2 points out that in general the RLS estimator is asymptotically biased, viz. it is not consistent with the "true" $\theta$ vector. Just to mention a few relevant cases, such a difficulty is met, even when $\varphi$ is a persistently exciting regressor, under the following circumstances:

• $n_a > 0$, viz. the data generating system is not FIR, and the equation error $\{v(t)\}$ in (12) is not a white process;

• $n_a$ and/or $n_b$ are chosen too small, and hence $v(t)$, depending on past I/O pairs, is correlated with $\varphi(t-1)$.

Another important situation which prevents RLS from being consistent is the loss of regressor persistency of excitation that typically takes place when the input $u(t)$ is solely generated by a dynamic feedback from the output $y(t)$.
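The first circumstance listed above can be illustrated by a small simulation. The sketch below is a hedged illustration with assumed values: a first-order system $y(t) + a\,y(t-1) = e(t) + c\,e(t-1)$ with no input, hypothetical coefficients $a = -0.5$, $c = 0.8$ and unit-variance white $e$. The off-line least squares estimate (to which RLS tends for large $t$) is compared with the limit predicted by (10b), which for this model is $-r_y(1)/r_y(0)$ with $r_y$ the output covariance function.

```python
import numpy as np

rng = np.random.default_rng(2)
a, c, N = -0.5, 0.8, 200_000      # hypothetical system and sample size
e = rng.standard_normal(N + 1)
y = np.zeros(N + 1)
for t in range(1, N + 1):
    # y(t) + a y(t-1) = e(t) + c e(t-1): coloured equation error, n_a > 0
    y[t] = -a * y[t - 1] + e[t] + c * e[t - 1]

phi = -y[:-1]                               # regressor phi(t-1) = -y(t-1)
a_ls = (phi @ y[1:]) / (phi @ phi)          # least squares estimate of a

# Limit from (10b) for this ARMA(1,1) process, with ar = -a:
# r(1)/r(0) = (1 + ar*c)(ar + c) / (1 + 2*ar*c + c^2)
ar = -a
a_limit = -(1 + ar * c) * (ar + c) / (1 + 2 * ar * c + c**2)
```

The estimate settles near `a_limit` (about $-0.746$) rather than near the true value $a = -0.5$, which is the asymptotic bias described by (13b).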



We now turn to the analysis of the RLS algorithm in the stochastic case under no ergodicity assumption. As anticipated at the beginning of this section, to this end we follow the stochastic Lyapunov function (or stochastic stability) approach. We limit our analysis to the RLS algorithm, taken here as a representative of other identification algorithms, such as pseudo-linear regression algorithms, for which, nonetheless, we shall indicate the conclusions achievable via a similar convergence analysis. The reader is referred to Appendix D for the necessary results on martingale convergence properties which will be used in the remaining part of this section.

We assume that the data generating system is given by the SISO ARX model

$$A(d)y(k) = B(d)u(k) + e(k) \tag{6.4-14a}$$

with $A(d)$ and $B(d)$ as in (3-1) and $e(k)$ the equation error, or equivalently

$$y(k) = \varphi'(k-1)\theta + e(k) \tag{6.4-14b}$$

with $\varphi(k-1)$ and $\theta$ as in (3-2). The stochastic assumptions are as follows. The process $\{\varphi(0), z(1), z(2), \cdots\}$, $z(k) := [\,y(k) \;\; u(k)\,]'$, is defined on an underlying probability space $(\Omega, \mathcal{F}, P)$, and we define $\mathcal{F}_0$ to be the $\sigma$-field generated by $\{\varphi(0)\}$. Further, for all $t \in \mathbb{Z}_1$, $\mathcal{F}_t$ denotes the $\sigma$-field generated by $\{\varphi(0), z(1), \cdots, z(t)\}$ or, equivalently, $\{\varphi(0), \varphi(1), \cdots, \varphi(t)\}$. Consequently, $\mathcal{F}_0 \subset \mathcal{F}_t \subset \mathcal{F}_{t+1}$, $t \in \mathbb{Z}_1$. The following independence and variance assumptions are adopted on the process $e$:

$$\mathcal{E}\{e(t) \mid \mathcal{F}_{t-1}\} = 0 \quad \text{a.s.} \tag{6.4-14c}$$

$$\mathcal{E}\{e^2(t) \mid \mathcal{F}_{t-1}\} = \sigma^2 \quad \text{a.s.} \tag{6.4-14d}$$

for every $t \in \mathbb{Z}_1$. Note that, by the smoothing properties of conditional expectations, (14c) and (14d) imply that $\{e(t)\}$ is zero-mean and white.

Theorem 6.4-3. (RLS Strong Consistency) Consider the RLS algorithm (3-13) applied to the data generated by the ARX system (14). Then, provided that

i. persistent excitation

$$\lim_{t\to\infty}\lambda_{\min}\big[P^{-1}(t)\big] = \infty \tag{6.4-15a}$$

ii. order condition

$$\limsup_{t\to\infty}\frac{\lambda_{\max}\big[P^{-1}(t)\big]}{\lambda_{\min}\big[P^{-1}(t)\big]} < \infty \tag{6.4-15b}$$

the RLS estimate is strongly convergent to $\theta$, i.e.

$$\lim_{t\to\infty}\theta(t) = \theta \quad \text{a.s.} \tag{6.4-16}$$

Problem 6.4-3 Prove that, if $0 < \rho_1 \le \rho_2 < \infty$, (15a) and (15b) are implied by the following persistent excitation condition

$$\rho_1 I_{n_\theta} \le \lim_{t\to\infty}\frac{1}{t}\sum_{k=1}^{t}\varphi(k-1)\varphi'(k-1) \le \rho_2 I_{n_\theta} \quad \text{a.s.} \tag{6.4-17}$$

but not vice versa. In particular note that asymptotic boundedness of $t^{-1}\sum_{k=1}^{t}\varphi(k-1)\varphi'(k-1)$ is a stability condition whereby the input $u(t)$, possibly determined through feedback from $\{y(k), k \le t\}$, stabilizes the system (14).

Proof As in (2a), let $\tilde\theta(t) := \theta(t) - \theta$. Next, as in (3), define $V(t) := \tilde\theta'(t)P^{-1}(t)\tilde\theta(t)$. Denoting $\operatorname{Tr}[P^{-1}(t)]$ by $r(t)$, from (3-13g) it follows that

$$r(t) = r(t-1) + \|\varphi(t-1)\|^2 \tag{6.4-18}$$

with $r(0) = \operatorname{Tr}[P^{-1}(0)] > 0$. The proof is based on the following two key results

$$\lim_{t\to\infty}\frac{V(t)}{r(t)} < \infty \quad \text{a.s.} \tag{6.4-19}$$

$$\sum_{t=1}^{\infty}\frac{\|\varphi(t-1)\|^2}{r(t-1)}\,\frac{V(t)}{r(t)} < \infty \quad \text{a.s.} \tag{6.4-20}$$

We first show how (19) and (20) can be used to prove the theorem and, next, we derive them via a stochastic Lyapunov function argument.
Eqs. (19) and (20) imply that

$$\lim_{t\to\infty}\frac{V(t)}{r(t)} = 0 \quad \text{a.s.} \tag{6.4-21}$$

In fact, (20) can be rewritten as

$$\sum_{t=1}^{\infty}\frac{r(t) - r(t-1)}{r(t-1)}\,\frac{V(t)}{r(t)} < \infty \tag{6.4-22a}$$

Now, we show by contradiction that

$$\sum_{t=1}^{\infty}\frac{r(t) - r(t-1)}{r(t-1)} = \infty \tag{6.4-22b}$$

On the contrary, suppose that the above is finite. Since (15a) implies that $\lim_{t\to\infty} r(t) = \infty$, Kronecker's lemma (Cf. Appendix D) yields

$$\lim_{t\to\infty}\frac{1}{r(t-1)}\sum_{k=1}^{t}[r(k) - r(k-1)] = 0$$

Hence, since $r(t-1) \le r(t)$

$$\lim_{t\to\infty}\frac{1}{r(t)}\sum_{k=1}^{t}[r(k) - r(k-1)] = \lim_{t\to\infty}\Big[1 - \frac{r(0)}{r(t)}\Big] = 0$$

This contradicts $\lim_{t\to\infty} r(t) = \infty$. Therefore, (22b) holds. Then, (21) follows from (19), (22a) and (22b). Now from the definition of $V(t)$,

$$\frac{V(t)}{r(t)} \ge \frac{\lambda_{\min}\big[P^{-1}(t)\big]\,\|\tilde\theta(t)\|^2}{r(t)} \ge \frac{\lambda_{\min}\big[P^{-1}(t)\big]}{n_\theta\,\lambda_{\max}\big[P^{-1}(t)\big]}\,\|\tilde\theta(t)\|^2$$

From (21) and the above, using (15b), we have

$$\lim_{t\to\infty}\|\tilde\theta(t)\|^2 = 0 \quad \text{a.s.}$$

and (16) follows.

We now proceed to prove (19) and (20). We do it in two steps.

(a) Calculation of $V(t)$ From (3-13) and (14) we get

$$\tilde\theta(t) - P(t-1)\varphi(t-1)\eta(t) = \tilde\theta(t-1) \tag{6.4-23}$$

where $\eta(t)$ denotes the a posteriori error

$$\eta(t) = y(t) - \varphi'(t-1)\theta(t) = -\varphi'(t-1)\tilde\theta(t) + e(t) = \frac{\varepsilon(t)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)}$$

and $\varepsilon(t)$ the a priori error $\varepsilon(t) := y(t) - \varphi'(t-1)\theta(t-1)$. Setting $b(t) := -\varphi'(t-1)\tilde\theta(t)$, from (23) we find

$$\begin{aligned}
V(t-1) &= \tilde\theta'(t)P^{-1}(t-1)\tilde\theta(t) + 2b(t)\eta(t) + \varphi'(t-1)P(t-1)\varphi(t-1)\eta^2(t) \\
&= V(t) - b^2(t) + 2b(t)\eta(t) + \varphi'(t-1)P(t-1)\varphi(t-1)\eta^2(t) \qquad [(3\text{-}13g)]
\end{aligned}$$

and recalling that $\eta(t) = b(t) + e(t)$

$$V(t) = V(t-1) - b^2(t) - 2b(t)e(t) - \varphi'(t-1)P(t-1)\varphi(t-1)\eta^2(t)$$

Taking conditional expectations w.r.t. the $\sigma$-field $\mathcal{F}_{t-1}$ gives

$$\mathcal{E}\{V(t) \mid \mathcal{F}_{t-1}\} = V(t-1) - \mathcal{E}\{b^2(t) \mid \mathcal{F}_{t-1}\} + 2\varphi'(t-1)P(t)\varphi(t-1)\sigma^2 - \mathcal{E}\{\varphi'(t-1)P(t-1)\varphi(t-1)\eta^2(t) \mid \mathcal{F}_{t-1}\} \tag{6.4-24}$$

Eq. (24) is obtained by using the following properties: $\theta(t) \in \mathcal{F}_t$, where this notation indicates that $\theta(t)$ is $\mathcal{F}_t$-measurable; $P(t) \in \mathcal{F}_{t-1}$ by (3-13g); and (Cf. Problem 4 below) $\mathcal{E}\{b(t)e(t) \mid \mathcal{F}_{t-1}\} = -\varphi'(t-1)P(t)\varphi(t-1)\sigma^2$.



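(b) Construction of a stochastic Lyapunov function Define

$$X(t) := \frac{V(t)}{r(t)} + \sum_{k=1}^{t}\frac{b^2(k)}{r(k-1)} + \sum_{k=1}^{t}\frac{\varphi'(k-1)P(k-1)\varphi(k-1)}{r(k-1)}\,\eta^2(k) + \sum_{k=1}^{t}\frac{\|\varphi(k-1)\|^2}{r(k-1)}\,\frac{V(k)}{r(k)} \tag{6.4-25}$$

By using (24), we show that $X(t)$ is a stochastic Lyapunov function, in that it is a positive process and

$$\mathcal{E}\{X(t) \mid \mathcal{F}_{t-1}\} \le X(t-1) + \frac{2\varphi'(t-1)P(t)\varphi(t-1)}{r(t-1)}\,\sigma^2 \quad \text{a.s.} \tag{6.4-26}$$

Since (Cf. Problem 5 below)

$$\sum_{k=1}^{\infty}\frac{\varphi'(k-1)P(k)\varphi(k-1)}{r(k-1)} < \infty \quad \text{a.s.} \tag{6.4-27}$$

by virtue of (26) we can apply the Martingale Convergence Theorem (Theorem D.5-1) to conclude that $\{X(t), t \in \mathbb{Z}_1\}$ converges a.s. to a finite random variable

$$\lim_{t\to\infty}X(t) = X < \infty \quad \text{a.s.} \tag{6.4-28}$$

In particular, since all the additive terms in (25) are nonnegative, (19) and (20) follow at once.
To prove (26) we take conditional expectations w.r.t. the $\sigma$-field $\mathcal{F}_{t-1}$ of every term in (25)

$$\begin{aligned}
\mathcal{E}\{X(t) \mid \mathcal{F}_{t-1}\} ={}& \frac{\mathcal{E}\{V(t) \mid \mathcal{F}_{t-1}\}}{r(t)} + \frac{\mathcal{E}\{b^2(t) \mid \mathcal{F}_{t-1}\}}{r(t-1)} + \sum_{k=1}^{t-1}\frac{b^2(k)}{r(k-1)} \\
&+ \frac{\mathcal{E}\{\varphi'(t-1)P(t-1)\varphi(t-1)\eta^2(t) \mid \mathcal{F}_{t-1}\}}{r(t-1)} + \sum_{k=1}^{t-1}\frac{\varphi'(k-1)P(k-1)\varphi(k-1)}{r(k-1)}\,\eta^2(k) \\
&+ \frac{\|\varphi(t-1)\|^2}{r(t-1)}\,\frac{\mathcal{E}\{V(t) \mid \mathcal{F}_{t-1}\}}{r(t)} + \sum_{k=1}^{t-1}\frac{\|\varphi(k-1)\|^2}{r(k-1)}\,\frac{V(k)}{r(k)}
\end{aligned}$$

Since $\frac{1}{r(t)}\Big[1 + \frac{\|\varphi(t-1)\|^2}{r(t-1)}\Big] = \frac{1}{r(t-1)}$, using (24) we get

$$\begin{aligned}
\mathcal{E}\{X(t) \mid \mathcal{F}_{t-1}\} ={}& \frac{V(t-1)}{r(t-1)} + \sum_{k=1}^{t-1}\frac{b^2(k)}{r(k-1)} + \sum_{k=1}^{t-1}\frac{\varphi'(k-1)P(k-1)\varphi(k-1)}{r(k-1)}\,\eta^2(k) \\
&+ \sum_{k=1}^{t-1}\frac{\|\varphi(k-1)\|^2}{r(k-1)}\,\frac{V(k)}{r(k)} + \frac{2\varphi'(t-1)P(t)\varphi(t-1)}{r(t-1)}\,\sigma^2
\end{aligned}$$

Hence (26) holds with the equality sign.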

Problem 6.4-4 Consider the RLS algorithm (3-13) applied to the data generated by the ARX system (14). Let $b(t) := -\varphi'(t-1)\tilde\theta(t)$. Show that

$$\mathcal{E}\{b(t)e(t) \mid \mathcal{F}_{t-1}\} = -\varphi'(t-1)P(t)\varphi(t-1)\sigma^2$$

Problem 6.4-5 Prove the existence of the bounded limit in (27). [Hint: Show that the nonnegative partial sum $\sum_{k=1}^{N}\frac{\varphi'(k-1)P(k)\varphi(k-1)}{r(k-1)}$ is dominated by the bounded, monotonically nondecreasing sequence $\sum_{k=1}^{N}\operatorname{Tr}[P(k-1) - P(k)] = \operatorname{Tr}P(0) - \operatorname{Tr}P(N)$.]

To estimate the parameter vector $\theta$ of (14), it is instructive to consider, instead of the RLS algorithm (3-13), the off-line Least Squares algorithm

$$\theta(t) = R^{-1}(t)\sum_{k=1}^{t}\varphi(k-1)y(k) \tag{6.4-29a}$$

$$R(t) := \sum_{k=1}^{t}\varphi(k-1)\varphi'(k-1) \tag{6.4-29b}$$

which can be seen to minimize the criterion (3-15) for $P^{-1}(0) = 0$. For such an algorithm we can prove that $\lim_{t\to\infty}\theta(t) = \theta$ a.s. provided that

$$\lim_{t\to\infty}\lambda_{\min}[R(t)] = \infty \quad \text{a.s.} \tag{6.4-30a}$$

$$\frac{R(t)}{\operatorname{Tr}[R(t)]} \ge \rho I_{n_\theta} > 0 \quad \text{a.s.} \tag{6.4-30b}$$

Note that $P^{-1}(t)$ reduces to $R(t)$ under the initialization $P^{-1}(0) = 0$. Hence, (30a) is a persistent excitation condition, whereas (30b) is similar to (15b). It is easy to see that (30) are implied by (17) but not vice versa. The strong consistency proof of (29) under (30) can be carried out [KV86] via a martingale convergence theorem, similarly to the proof of Theorem 3 but in a somewhat more direct fashion.

In [LW82] it was proved via a direct analysis that RLS strong consistency is still guaranteed if the conditions (15) are relaxed as follows

$$\lim_{t\to\infty}\frac{\lambda_{\min}\big[P^{-1}(t)\big]}{\log\lambda_{\max}\big[P^{-1}(t)\big]} = \infty \quad \text{a.s.} \tag{6.4-31}$$

Further, (31) follows from (30a) and

$$\lim_{t\to\infty}\frac{\lambda_{\min}[R(t)]}{\log\lambda_{\max}[R(t)]} = \infty \quad \text{a.s.} \tag{6.4-32}$$

According to [LW82], (30a) and (32) make up in some sense the weakest possible condition for establishing RLS strong convergence for possibly unstable systems and feedback control systems with white noise disturbances.

6.4.3 RELS Convergence Results 

We consider the RELS(PO) algorithm (3-26b), (3-26d)-(3-27b) under the assumption that the data satisfy the ARMAX model

$$A(d)y(t) = B(d)u(t) + C(d)e(t) \tag{6.4-33a}$$

or

$$y(t) = \varphi_e'(t-1)\theta + e(t) \tag{6.4-33b}$$

with $\varphi_e(t-1)$ and the "true" parameter vector $\theta$ as in (3-26a). The stochastic assumptions are as follows. All the involved processes, as well as $\varphi_e(0)$, are defined on an underlying probability space $(\Omega, \mathcal{F}, P)$. $\mathcal{F}_0$ is defined to be the $\sigma$-field generated by $\{\varphi_e(0)\}$. Further, for all $t \in \mathbb{Z}_1$, $\mathcal{F}_t$ denotes the $\sigma$-field generated by $\{\varphi_e(0), z(1), \cdots, z(t)\}$, $z(k) := [\,y(k) \;\; u(k)\,]'$, or equivalently $\{\varphi_e(0), \varphi_e(1), \cdots, \varphi_e(t)\}$. Consequently, $\mathcal{F}_0 \subset \mathcal{F}_t \subset \mathcal{F}_{t+1}$, $t \in \mathbb{Z}_1$. We have also for the process $e$

$$\mathcal{E}\{e(t) \mid \mathcal{F}_{t-1}\} = 0 \quad \text{a.s.} \tag{6.4-33c}$$

$$\mathcal{E}\{e^2(t) \mid \mathcal{F}_{t-1}\} = \sigma^2 \quad \text{a.s.} \tag{6.4-33d}$$

$$\limsup_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N}e^2(k) < \infty \quad \text{a.s.} \tag{6.4-33e}$$

In order to state the desired result we need an extra definition. Given a $p \times p$ matrix $H(d)$ of rational functions with real coefficients, we say that $H(d)$ is positive real (PR) if

$$H(e^{i\omega}) + H'(e^{-i\omega}) \ge 0, \qquad \omega \in [0, 2\pi) \tag{6.4-34}$$

$H(e^{i\omega})$ is said to be strictly positive real (SPR) if the above is a strict inequality. Note that for $p = 1$ (34) becomes $\operatorname{Re}[H(e^{i\omega})] \ge 0$ where Re denotes "real part".
Figure 6.4-1: Polar diagram of $C(e^{i\omega})$ with $C(d)$ as in (37).



Theorem 6.4-4. (Strong Consistency of RELS(PO)) Consider the RELS(PO) algorithm (3-26b), (3-26d)-(3-27b) applied to the data generated by the ARMAX system (33). Assume further that the following conditions hold:

(1) (Stability condition) $\det[C(d)]$ is a strictly Hurwitz polynomial;

(2) (Positive real condition) $\dfrac{1}{C(d)} - \dfrac{1}{2}$ is SPR;

(3) (Persistent excitation) The sample mean limit of the outer products of the process $\varphi_e$ exists a.s. with

$$\rho_1 I_{n_\theta} \le \lim_{N\to\infty}\frac{1}{N}\sum_{k=1}^{N}\varphi_e(k-1)\varphi_e'(k-1) \le \rho_2 I_{n_\theta} \tag{6.4-35}$$

with $0 < \rho_1 \le \rho_2 < \infty$.
Then, the RELS(PO) estimate is strongly convergent to $\theta$, i.e.

$$\lim_{t\to\infty}\theta(t) = \theta \quad \text{a.s.}$$

Problem 6.4-6 Let $C(d)$ be a polynomial. Show that for $\omega \in [0, 2\pi)$

$$\big|C(e^{i\omega}) - 1\big| < 1 \;\iff\; \frac{1}{C(d)} - \frac{1}{2} \text{ is SPR} \;\implies\; C(d) \text{ is SPR} \tag{6.4-36}$$

Note that (36) indicates that the SPR condition in Theorem 4 amounts to assuming that (33a) is not too far from the ARX model $A(d)y(t) = B(d)u(t) + e(t)$ with $A(d)$ and $B(d)$ as in (33a).



Example 6.4-1 Consider the ARMAX model (33a) with

$$A(d) = 1 + d + 0.9d^2, \qquad B(d) = 0, \qquad C(d) = 1 + 1.5d + 0.75d^2 \tag{6.4-37}$$

Fig. 1 depicts the polar diagram of $C(e^{i\omega})$. We see that $C(d)$ is not SPR and this, in turn, implies that $\frac{1}{C(d)} - \frac{1}{2}$ is not SPR. If the RELS(PO) algorithm with no input data in the pseudoregressor (27b) is applied to the data generated by the ARMA model (37) we get the results in Fig. 2. This shows the time evolution of the four components of $\theta(t) = [\,a_1(t) \;\; a_2(t) \;\; c_1(t) \;\; c_2(t)\,]'$. We see from Fig. 2 that the algorithm attempts to reach the true values. However, convergence is not achieved in that when the estimates come close to the optimal ones, they are pushed away and keep on bouncing below the true values of the parameters.

Figure 6.4-2: Time evolution of the four RELS estimated parameters when the data are generated by the ARMA model (37).
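The failure of the positive real condition for the $C(d)$ of (37) can be verified on a frequency grid. The following sketch is illustrative only: the grid size is an arbitrary choice and a finite grid can only approximate the minimum over $\omega \in [0, 2\pi)$. It evaluates $\operatorname{Re}[C(e^{i\omega})]$, $\operatorname{Re}[1/C(e^{i\omega}) - 1/2]$ and the equivalent test $|C(e^{i\omega}) - 1| < 1$ of Problem 6.4-6.

```python
import numpy as np

# Frequency grid on [0, 2*pi); the resolution is an illustrative choice.
omega = np.linspace(0.0, 2.0 * np.pi, 4001, endpoint=False)
z = np.exp(1j * omega)

# C(d) of (37) evaluated at d = exp(i*omega)
C = 1.0 + 1.5 * z + 0.75 * z**2

min_re_C = C.real.min()                       # SPR of C(d) needs this > 0
min_re_cond = (1.0 / C - 0.5).real.min()      # condition (2) of Theorem 4
inside_unit_disc = bool(np.all(np.abs(C - 1.0) < 1.0))   # test from (36)
```

Writing $x = \cos\omega$, one has $\operatorname{Re}[C(e^{i\omega})] = 0.25 + 1.5x + 1.5x^2$, whose minimum $-0.125$ is attained at $x = -1/2$; the grid minimum agrees with this value, and both positive real tests fail, consistently with the example.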

For a proof of Theorem 4 the reader is referred to pp. 556-565 of [Cai88]. This proof follows similar lines as the ones of Theorem 3 with extra complications arising from the presence here of the $C(d)$ innovations polynomial. As for the RLS algorithm, a direct approach was presented in [LW86] which allows one to replace the persistent excitation condition (35) by the weaker condition (31) provided that $P(t)$ is as in (3-26e) with $\varphi(t)$ as in (3-27b). Eq. (31) is in turn implied by

$$\lim_{t\to\infty}\lambda_{\min}[R_e(t)] = \infty \quad \text{a.s.}$$

and

$$\lim_{t\to\infty}\frac{\lambda_{\min}[R_e(t)]}{\log\lambda_{\max}[R_e(t)]} = \infty \quad \text{a.s.}$$

where $R_e(t) := \sum_{k=1}^{t}\varphi_e(k-1)\varphi_e'(k-1)$.

For a discussion of the strong consistency of a variant of RELS(PO) where the pseudo-regressor vector used in the algorithm is obtained by filtering the one of RELS(PO) by a fixed stable filter $1/D(d)$, the reader is referred to [GS84]. Note that this identification method resembles, and has its justification in, the RML algorithm (3.28). Though strong consistency is proved for such a case only under the restrictive assumption that $A(d)$ is strictly Hurwitz, it satisfies our intuition to see that the SPR condition of Theorem 4 is modified as follows

$$\frac{D(d)}{C(d)} - \frac{1}{2} \text{ is SPR}$$

Even if $\frac{1}{C(d)} - \frac{1}{2}$ is not SPR, the above condition can be satisfied by choosing $D(d)$ close to $C(d)$ provided that the latter can be guessed with a good approximation.

Main points of the section The Lyapunov function method and its stochastic extension based on martingale convergence theorems can be used to prove deterministic convergence and, respectively, stochastic strong consistency of the RLS algorithm. The crucial conditions which must be satisfied to this end are: the model matching condition, viz. that the true data generating system belongs to the model set parameterized by the vector to be identified; and that the inputs satisfy appropriate persistent excitation conditions.

Strong convergence of pseudo-linear regression algorithms, e.g. RELS (PO), 
requires strong additional assumptions, such as mean-square boundedness of the 
involved signal and satisfaction of a strict positive real condition. 

While convergence analysis of recursive identification algorithms is highly in- 
structive to understand their potential, it falls short for adaptive control where the 
above mentioned conditions are usually not satisfied or cannot be guaranteed. 

Notes and References 

Prediction problems of stationary time series were independently and simultane- 
ously considered by Kolmogorov [Kol41] and Wiener [Wie49] for the discrete-time 
and, respectively, the continuous-time parameter case. The first used Wold's idea
[Wol38] of representing time series in terms of innovations. Later [WM57], [WM58],
Wiener used the Hilbert space framework of Kolmogorov for addressing the problem
for the multivariate stationary discrete-time processes. In 1960, Kalman [Kal60b]
presented the first recursive solution to the nonstationary prediction problem for
discrete-time processes represented by stochastic linear state-space models. The
solution for the analogous problem with continuous-time parameter was given in
[KB61]. An informative survey of the development of the subject is [Kai74]; see also
[Kai76]. The literature on Kalman filtering is now immense, e.g.: [AM79]; [Gel74];
[Jaz70]; [May79]; [Med69]; [Sol88]; [Won70]. For the delicate issues of robustified
implementations of the Kalman filter via matrix factorizations, see [Bie77].

The problem of parameter estimation, and the associated topics of biasedness, 
consistency, efficiency, maximum likelihood estimators, are well covered in books 
of statistics, e.g.: [KS79], [Cra46] and [Rao73]. In [GP77], [Lju87] and [SS89] 
these concepts are applied in the identification of linear systems. See also [BJ76],
[Cai88], [Che85], [CG91], [Eyk74], [GS84], [HD88], [Joh88], [KR76], [Lan90], [LS83], 
[MG90], [ML76], [Men73], [Nor87], [TAG81], and [UR87]. For a Bayesian approach 
see [Pet81]. The use of prediction models in stochastic modelling and the inter- 
pretation of the RLS and RML algorithms as prediction error methods have been 
emphasized in [Cai76] and [Lju78]. 

The RELS method was first proposed by a number of authors: [ABW65], 
[May65], [Pan68] and [You68]. The RML was derived in [Söd73]. There is a vast
literature on how to implement recursive identification algorithms via robust nu- 
merical methods [Bie77], [LH74]. See also [KHB+85] and [Pet86]. 



CHAPTER 7 



LQ AND PREDICTIVE 
STOCHASTIC CONTROL 



The purpose of this chapter is to extend LQ and predictive receding-horizon control 
to a stochastic setting. In Sect. 1 and Sect. 2 we consider the LQ regulation problem
for stochastic linear dynamic plants when the plant state is either completely
or only partially accessible to the controller. Stochastic Dynamic Programming is 
used to yield the optimal solution via the so-called Certainty-Equivalence Princi- 
ple. In Sect. 3 two distinct steady-state regulation problems for CARMA plants are 
considered. The first consists of a single step regulation problem based on a perfor- 
mance index given by a conditional expectation. The second adopts the criterion 
of minimizing the unconditional expectation of a quadratic cost. Both problems 
are tackled via the stochastic variant of the polynomial equation approach intro- 
duced in Chapter 4. Sect. 4 discusses some monotonic performance properties of 
steady-state LQ stochastic regulation. Sect. 5 deals with 2-DOF tracking and 
servo problems. The relationship between LQ stochastic control and $H_\infty$ control
is pointed out in Sect. 6. Finally, Sect. 7 extends SIORHR and SIORHC, two 
predictive receding-horizon controllers introduced in Chapter 5, to steady-state 
regulation and control of CARMA and CARIMA plants. 

7.1 LQ Stochastic Regulation: Complete State Information

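The time evolution of the state $x(k)$ of the plant to be regulated is represented here
as follows

\[ x(k+1) = \Phi(k)x(k) + G(k)u(k) + \xi(k) \tag{7.1-1a} \]

where $k \in [t_0, T)$, $x(k) \in \mathbb{R}^n$, $u(k) \in \mathbb{R}^m$, $\xi(k) \in \mathbb{R}^n$, $u(k)$ is the manipulated input
and $\xi(k)$ an inaccessible disturbance. The initial state $x(t_0)$ and the processes $\xi$
and $u$ are defined on the underlying probability space $(\Omega, \mathcal{F}, P)$. We consider a
nondecreasing family of sub-$\sigma$-fields $\{\mathcal{F}_k\}_{k \ge t_0}$, $\mathcal{F}_{t_0} \subset \cdots \subset \mathcal{F}_k \subset \mathcal{F}_{k+1} \subset \mathcal{F}$, such
that $x(t_0) \in \mathcal{F}_{t_0}$, $\xi(k) \in \mathcal{F}_k$. Here we use the shorthand notation $v \in \mathcal{F}$ to
state that $v$ is $\mathcal{F}$-measurable. Note that if we let $\mathcal{F}_k$ be the $\sigma$-field generated by
$\{x(t_0), \xi(t_0), \cdots, \xi(k)\}$

\[ \mathcal{F}_k := \sigma\{x(t_0), \xi(t_0), \cdots, \xi(k)\} \]

then $\{\mathcal{F}_k\}_{k \ge t_0}$ has the stipulated properties. Further, the disturbance $\xi$ has the
martingale difference property

\[ \left. \begin{array}{l} \mathcal{E}\{\xi(k) \mid \mathcal{F}_{k-1}\} = O_n \quad \text{a.s.} \\ \mathcal{E}\{\xi(k)\xi'(k) \mid \mathcal{F}_{k-1}\} = \Psi_\xi(k) < \infty \quad \text{a.s.} \end{array} \right\} \tag{7.1-1b} \]

and

\[ \mathcal{E}\{\xi(t_0)x'(t_0)\} = O_{n \times n} \tag{7.1-1c} \]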

Note that no Gaussianity assumption is used here and that (1b) implies that $\xi$ is
zero-mean and white.

We next elucidate the nature of the process $u$. In the present complete state
information case, $u(k)$ is allowed to be measurable w.r.t. the $\sigma$-field generated by
$I^k$

\[ u(k) \in \sigma\{I^k\} \tag{7.1-2} \]

where $I^k := \{x^k, u^{k-1}\}$, $x^k := \{x(i)\}_{i=t_0}^{k}$. In words, $u(k)$ can be computed as
a function of the realizations of $I^k$. Eq. (2) specifies the admissible regulation
strategy. Note that the strategy (2) is nonanticipative or causal, in that $u(k)$ can
be computed in terms of past realizations of $u$, and present and past realizations
of $x$.

We consider the following quadratic performance index

\[ \mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\} = \mathcal{E}\Big\{ \sum_{k=t_0}^{T} \ell(k, x(k), u(k)) \Big\} \tag{7.1-3a} \]

\[ \ell(k, x(k), u(k)) := \|x(k)\|^2_{\psi_x(k)} + 2u'(k)M(k)x(k) + \|u(k)\|^2_{\psi_u(k)} \tag{7.1-3b} \]

\[ \ell(T, x(T), u(T)) := \|x(T)\|^2_{\psi_x(T)} \tag{7.1-3c} \]

For the properties of the matrices $\psi_x(k)$, $\psi_u(k)$ and $M(k)$, the reader is referred to
Sect. 2.1. We wish to consider the following problem.

LQ Stochastic (LQS) regulator with complete state information 

Consider the stochastic linear plant (1) and the quadratic performance index (3).
Find an input sequence $u^0_{[t_0,T)}$ to the plant that minimizes the
performance index among all the admissible regulation strategies (2).

We tackle the problem via Stochastic Dynamic Programming [Ber76], [BS78]. This
is the extension to a stochastic setting of the Dynamic Programming technique 
discussed in Sect. 2.2. 

For $t \in [t_0, T]$, we introduce the Bellman function

\[
\begin{aligned}
V(t, x(t)) &:= \min_{u_{[t,T)}} \mathcal{E}\Big\{ \sum_{k=t}^{T} \ell(k, x(k), u(k)) \,\Big|\, I^t \Big\} \\
&= \min_{u_{[t,T)}} \mathcal{E}\Big\{ \sum_{k=t}^{T} \ell(k, x(k), u(k)) \,\Big|\, x(t) \Big\}
\end{aligned}
\tag{7.1-4}
\]

where the last equality follows since, the plant being governed by the Markovian
stochastic difference equation (1a), the conditional probability distribution of future
plant variables, given $I^t$, depends on $x(t)$ only. This consideration leads us to
conclude that the optimal input $u^0_{[t_0,T)}$ does indeed satisfy the following admissible
regulation strategy

\[ u(k) \in \sigma\{x(k)\} \tag{7.1-5} \]

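I.e., the optimal regulation law is in a state-feedback form. For any $t_1 \in [t, T)$, we
can write

\[
\begin{aligned}
V(t, x(t)) &= \min_{u_{[t,t_1)}} \Big[ \mathcal{E}\Big\{ \sum_{k=t}^{t_1-1} \ell(k, x(k), u(k)) \,\Big|\, x(t) \Big\} + \min_{u_{[t_1,T)}} \mathcal{E}\Big\{ \sum_{k=t_1}^{T} \ell(k, x(k), u(k)) \,\Big|\, x(t) \Big\} \Big] \\
&= \min_{u_{[t,t_1)}} \mathcal{E}\Big\{ \sum_{k=t}^{t_1-1} \ell(k, x(k), u(k)) + V(t_1, x(t_1)) \,\Big|\, x(t) \Big\}
\end{aligned}
\tag{7.1-6}
\]

The last equality follows since, by the smoothing properties of conditional
expectations (Cf. Appendix D),

\[
\begin{aligned}
\min_{u_{[t_1,T)}} \mathcal{E}\Big\{ \sum_{k=t_1}^{T} \ell(k, x(k), u(k)) \,\Big|\, x(t) \Big\}
&= \min_{u_{[t_1,T)}} \mathcal{E}\Big\{ \mathcal{E}\Big\{ \sum_{k=t_1}^{T} \ell(k, x(k), u(k)) \,\Big|\, x(t_1), x(t) \Big\} \,\Big|\, x(t) \Big\} \\
&= \min_{u_{[t_1,T)}} \mathcal{E}\big\{ V(t_1, x(t_1)) \mid x(t) \big\}
\end{aligned}
\]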

Setting $t_1 = t + 1$ in (6), we get the stochastic Bellman equation

\[ V(t, x(t)) = \min_{u(t)} \Big[ \ell(t, x(t), u(t)) + \mathcal{E}\big\{ V(t+1, x(t+1)) \mid x(t) \big\} \Big] \tag{7.1-7} \]

with terminal condition

\[ V(T, x(T)) = \ell(T, x(T), u(T)) = \|x(T)\|^2_{\psi_x(T)} \tag{7.1-8} \]



The last two equations correspond to (2.2-8) and (2.2-9) in the deterministic setting
of Sect. 2.2. The functional equation (7) can be used as follows. For $t = T - 1$, it
yields

\[
\begin{aligned}
V(T-1, x(T-1)) = \min_{u(T-1)} \Big[ & \ell(T-1, x(T-1), u(T-1)) \; + \\
& \mathcal{E}\big\{ \|\Phi(T-1)x(T-1) + G(T-1)u(T-1) + \xi(T-1)\|^2_{\psi_x(T)} \,\big|\, x(T-1) \big\} \Big]
\end{aligned}
\]

This can be solved w.r.t. $u(T-1)$, giving the optimal input at time $T - 1$ in a
state-feedback form

\[ u^0(T-1) = u^0(T-1, x(T-1)) \]

and hence determines $V(T-1, x(T-1))$. By iterating backward the above procedure,
we can determine the optimal control law in a state-feedback form

\[ u^0(k) = u^0(k, x(k)), \qquad k \in [t_0, T) \]

and $V(k, x(k))$. The next theorem verifies that the above procedure solves the
LQSR-CSI problem.
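
As a worked instance of the step $t = T - 1$, the conditional expectation can be expanded explicitly. The sketch below assumes that $\psi_u + G'\psi_x(T)G$ is nonsingular (an assumption made here only for illustration) and drops the time argument $T-1$ of $\Phi$, $G$, $M$, $\psi_x$, $\psi_u$ and $x$ for brevity. Since, by (1b), $\xi(T-1)$ has zero conditional mean and conditional covariance $\Psi_\xi(T-1)$,

\[ \mathcal{E}\big\{ \|\Phi x + Gu + \xi(T-1)\|^2_{\psi_x(T)} \,\big|\, x \big\} = \|\Phi x + Gu\|^2_{\psi_x(T)} + \mathrm{Tr}\,[\psi_x(T)\Psi_\xi(T-1)] \]

so that the quantity to be minimized in (7) is

\[ \|x\|^2_{\psi_x} + 2u'Mx + \|u\|^2_{\psi_u} + \|\Phi x + Gu\|^2_{\psi_x(T)} + \mathrm{Tr}\,[\psi_x(T)\Psi_\xi(T-1)] \]

Equating to zero its gradient w.r.t. $u$ gives

\[ u^0(T-1) = -\big(\psi_u + G'\psi_x(T)G\big)^{-1}\big(M + G'\psi_x(T)\Phi\big)\,x(T-1) \]

The disturbance enters only through the additive trace term, which does not depend on $u(T-1)$. This is the mechanism by which the noise leaves the feedback gain unchanged and only increases the cost.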

Theorem 7.1-1. Suppose that $\{V(t, x)\}_{t=t_0}^{T}$ satisfies the stochastic Bellman
equation (7) with terminal condition (8). Suppose that the minimum in (7) is attained
at

\[ \hat{u}(t) = \hat{u}(t, x), \qquad t \in [t_0, T) \]

Then $\hat{u}_{[t_0,T)}$ minimizes the cost $\mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\}$ over the class of all
state-feedback inputs. Further, the minimum cost equals $\mathcal{E}\{V(t_0, x(t_0))\}$.

Proof Let $u(t, x(t))$ be an arbitrary feedback input and $x(t)$ the process generated by (1) with
$u(t) = u(t, x(t))$. We have

\[ V(t_0, x(t_0)) - V(T, x(T)) = \sum_{t=t_0}^{T-1} \big[ V(t, x(t)) - V(t+1, x(t+1)) \big] \tag{7.1-9} \]

By the smoothing properties of conditional expectations, we obtain

\[
\begin{aligned}
\mathcal{E}\{V(t, x(t)) - V(t+1, x(t+1))\} &= \mathcal{E}\big\{ \mathcal{E}\{ V(t, x(t)) - V(t+1, x(t+1)) \mid x(t) \} \big\} \\
&= \mathcal{E}\big\{ V(t, x(t)) - \mathcal{E}\{ V(t+1, x(t+1)) \mid x(t) \} \big\} \\
&\le \mathcal{E}\{\ell(t, x(t), u(t))\} \qquad [(7)]
\end{aligned}
\tag{7.1-10}
\]

Taking the expectation of both sides of (9), we get

\[ \mathcal{E}\Big\{ \sum_{t=t_0}^{T-1} \big[ V(t, x(t)) - V(t+1, x(t+1)) \big] \Big\} \le \mathcal{E}\Big\{ \sum_{t=t_0}^{T-1} \ell(t, x(t), u(t)) \Big\} \qquad [(10)] \tag{7.1-11} \]

Hence, from (8) it follows that

\[ \mathcal{E}\{V(t_0, x(t_0))\} \le \mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\} \tag{7.1-12} \]

Conversely, the same argument holds with equality instead of inequality in (11) when $u(t) = \hat{u}(t)$.
Consequently,

\[ \mathcal{E}\{V(t_0, x(t_0))\} = \mathcal{E}\{J(t_0, x(t_0), \hat{u}_{[t_0,T)})\} \tag{7.1-13} \]

From (12) and (13) it follows that $\hat{u}_{[t_0,T)}$ is optimal and that the minimum cost equals $\mathcal{E}\{V(t_0, x(t_0))\}$.

The solution of the stochastic Bellman equation (7) is related in a simple way to
that of its deterministic counterpart (2.3-7). In fact, in the present stochastic case
we have

\[ V(t, x) = x'\mathcal{P}(t)x + v(t) \tag{7.1-14} \]

with $\mathcal{P}(T) = \psi_x(T)$ and $v(T) = 0$. Assuming (14) to be true, the induction
argument, as used in the deterministic case of Theorem 2.3-1, shows that

\[ V(t-1, x) = x'\mathcal{P}(t-1)x + v(t) + \mathrm{Tr}\,[\mathcal{P}(t)\Psi_\xi(t-1)] \]

with $\mathcal{P}(t)$ given by the Riccati backward iterations (2.3-3)-(2.3-6). Further, since
$v(T) = 0$, working backward from $t = T$ we find that

\[ v(t) = \sum_{k=t}^{T-1} \mathrm{Tr}\,[\mathcal{P}(k+1)\Psi_\xi(k)] \tag{7.1-15} \]

We observe that $v(t)$ is not affected by $u_{[t_0,T)}$. Hence, the optimal inputs obtained
by (7) are given by (2.3-1) as if the plant were deterministic, i.e. $\xi(t) = O_n$.
Summing up we have the following result.

Theorem 7.1-2. (LQS regulator with complete state information)

Among all the admissible strategies (2), the optimal input for the LQS regulator
with complete state information is given by the linear state-feedback regulation law

\[ u(t) = F(t)x(t), \qquad t \in [t_0, T) \tag{7.1-16} \]

In (16) the optimal feedback-gain matrix $F(t)$ is the same as in the deterministic
case ($\xi(t) = O_n$) of Theorem 2.3-1 and given by (2.3-2) in terms of the solution
$\mathcal{P}(t+1)$ of the Riccati backward difference equation (2.3-3)-(2.3-6). Further, the
minimum cost incurred over the regulation horizon $[t, T]$ for the optimal input
sequence $u_{[t,T)}$, conditional to the initial plant state $x(t)$, is given by

\[
\begin{aligned}
V(t, x(t)) &= \min_{u_{[t,T)}} \mathcal{E}\Big\{ \sum_{k=t}^{T} \ell(k, x(k), u(k)) \,\Big|\, x(t) \Big\} \\
&= x'(t)\mathcal{P}(t)x(t) + \sum_{k=t}^{T-1} \mathrm{Tr}\,[\mathcal{P}(k+1)\Psi_\xi(k)]
\end{aligned}
\tag{7.1-17}
\]

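Problem 7.1-1 Using the induction argument, prove (17) and (16).

Problem 7.1-2 Taking into account (17), show that the minimum achievable cost over $[t, T)$
equals

\[
\begin{aligned}
\min_{u_{[t,T)}} \mathcal{E}\{J(t, x(t), u_{[t,T)})\} &= \mathcal{E}\{V(t, x(t))\} \\
&= \|\mathcal{E}\{x(t)\}\|^2_{\mathcal{P}(t)} + \mathrm{Tr}\,[\mathcal{P}(t)\,\mathrm{Cov}(x(t))] + \sum_{k=t}^{T-1} \mathrm{Tr}\,[\mathcal{P}(k+1)\Psi_\xi(k)]
\end{aligned}
\tag{7.1-18}
\]

[Hint: Use Lemma D.1 of Appendix D.]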



Notice that in (18) the first two terms depend on the distribution of the initial
state, while the third is due to the disturbance $\xi$ forcing the plant (1a).

Main points of the section For any horizon of finite length and possibly
non-Gaussian disturbances the LQS regulation problem with complete state information
is solved by a linear time-varying state-feedback regulation law which is the same
as if the plant (1a) were deterministic, i.e. $\xi(k) = O_n$.
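
The main points above can be checked numerically on a scalar plant. The following is a minimal sketch, not the book's algorithm: it assumes a time-invariant scalar plant $x(k+1) = a\,x(k) + b\,u(k) + \xi(k)$ with stage cost $q\,x^2 + r\,u^2$ (no cross term $M$) and uses the standard scalar Riccati recursion in place of (2.3-3)-(2.3-6), which are not reproduced in this section. The names `lqs_backward` and `closed_loop_cost` and all numerical values are hypothetical.

```python
def lqs_backward(a, b, q, r, q_T, var_xi, N):
    """Backward recursion for the scalar plant x(k+1) = a x(k) + b u(k) + xi(k).

    Stage cost q x^2 + r u^2 (no cross term), terminal cost q_T x^2.
    Returns lists P[0..N], F[0..N-1], v[0..N]; index 0 stands for t0.
    """
    P = [0.0] * (N + 1)
    v = [0.0] * (N + 1)
    F = [0.0] * N
    P[N] = q_T                                   # P(T) = psi_x(T), v(T) = 0
    for t in range(N - 1, -1, -1):
        denom = r + b * b * P[t + 1]
        F[t] = -a * b * P[t + 1] / denom         # gain: var_xi does not appear
        P[t] = q + a * a * P[t + 1] - (a * b * P[t + 1]) ** 2 / denom
        v[t] = v[t + 1] + P[t + 1] * var_xi      # adds Tr[P(k+1) Psi_xi(k)], k = t
    return P, F, v


def closed_loop_cost(a, b, q, r, q_T, var_xi, F, second_moment_x0):
    """Expected cost of u(k) = F[k] x(k), obtained by propagating E{x^2}."""
    m2, cost = second_moment_x0, 0.0
    for gain in F:
        cost += (q + r * gain * gain) * m2
        m2 = (a + b * gain) ** 2 * m2 + var_xi   # xi(k) is zero-mean and white
    return cost + q_T * m2


if __name__ == "__main__":
    P, F, v = lqs_backward(0.9, 1.0, 1.0, 0.1, 1.0, 0.25, 20)
    P_det, F_det, _ = lqs_backward(0.9, 1.0, 1.0, 0.1, 1.0, 0.0, 20)
    print(F == F_det)                            # same gains with and without noise
    print(closed_loop_cost(0.9, 1.0, 1.0, 0.1, 1.0, 0.25, F, 4.0), P[0] * 4.0 + v[0])
```

The gains are identical with and without the disturbance, and the expected closed-loop cost agrees with $\mathcal{P}(t_0)\,\mathcal{E}\{x^2(t_0)\} + v(t_0)$, which is the scalar form of (18).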



Problem 7.1-3 Consider the plant given by the SISO CAR model (Cf. (6.2-69b))

\[ A(d)y(k) = B(d)u(k) + e(k) \tag{7.1-19} \]

with the polynomials $A(d)$ and $B(d)$ as in (6.1-8). Let $s(k)$ be the vector

\[ s(k) := \Big[ \big(y^{k}_{k-n_a+1}\big)' \ \ \big(u^{k-1}_{k-n_b+1}\big)' \Big]' \in \mathbb{R}^{n_a+n_b-1} \tag{7.1-20} \]

Then (Cf. Example 5.4-1), (19) can be written in state-space form as follows

\[ \left. \begin{array}{l} s(k+1) = \Phi s(k) + G_u u(k) + G e(k+1) \\ y(k) = H s(k) \end{array} \right\} \tag{7.1-21} \]

where $(\Phi, G_u, H)$ are defined in Example 5.4-1 and $G = e_{n_a}$. For

\[ \xi(k) := e(k+1) \]

and

\[ \mathcal{F}_k := \sigma\{x(t_0), \xi(t_0), \cdots, \xi(k)\}, \qquad \mathcal{F}_{t_0-1} := \{\emptyset, \Omega\} \]

assume that (1b) and (1c) are satisfied. Consider the cost

\[ \mathcal{E}\{J(t_0, s(t_0), u_{[t_0,T)})\} = \mathcal{E}\Big\{ \sum_{k=t_0}^{T-1} \big[ y^2(k) + \rho u^2(k) \big] \Big\} \tag{7.1-22} \]

$\rho > 0$, and the admissible regulation strategy

\[ u(k) \in \sigma\{s(t_0), y^k, u^{k-1}\} \tag{7.1-23} \]

Show that the problem of finding, among all the admissible inputs (23), the ones minimizing (22)
for the plant (19), is an LQS regulation problem with complete state information. Further, specify
suitable conditions on $A(d)$ and $B(d)$ which guarantee the existence of the limiting control law
$u(t) = Fs(t)$ as $T \to \infty$. Compare the conclusion with those of Problem 2.4-5.



7.2 LQ Stochastic Regulation: Partial State Information

7.2.1 LQG Regulation

We shall refer to the plant as the combination of the system (1-1a) to be regulated
along with a state sensing device which makes available at every time $k$ an
observation $z(k)$ of linear combinations of the state $x(k)$ corrupted by a sensor noise
$\zeta(k)$:

\[ \left. \begin{array}{l} x(k+1) = \Phi(k)x(k) + G(k)u(k) + \xi(k) \\ z(k) = H(k)x(k) + \zeta(k) \end{array} \right\} \tag{7.2-1a} \]

with $x$, $u$, $\xi$, $\zeta$ defined on the probability space $(\Omega, \mathcal{F}, P)$,

\[ x(k), \xi(k) \in \mathbb{R}^n, \qquad u(k) \in \mathbb{R}^m, \qquad z(k), \zeta(k) \in \mathbb{R}^p \]

and all matrices of compatible dimensions. Let

\[ \nu(k) := [\, \xi'(k) \ \ \zeta'(k) \,]' \]

Define the family of sub-$\sigma$-fields $\{\mathcal{F}_k\}_{k \ge t_0}$ as follows

\[ \mathcal{F}_k := \sigma\{x(t_0), \nu(t_0), \cdots, \nu(k)\}, \qquad \mathcal{F}_{t_0-1} := \{\emptyset, \Omega\} \]

and assume that $\nu$ has the martingale difference property

\[ \left. \begin{array}{l} \mathcal{E}\{\nu(k) \mid \mathcal{F}_{k-1}\} = O_{n+p} \quad \text{a.s.} \\ \mathcal{E}\{\nu(k)\nu'(k) \mid \mathcal{F}_{k-1}\} = \Psi_\nu(k) = \begin{bmatrix} \Psi_\xi(k) & O_{n \times p} \\ O_{p \times n} & \Psi_\zeta(k) \end{bmatrix} \quad \text{a.s.} \end{array} \right\} \tag{7.2-1b} \]

\[ \mathcal{E}\{\nu(t_0)x'(t_0)\} = O_{(n+p) \times n} \tag{7.2-1c} \]

and

\[ x(t_0) \text{ and } \{\nu(k)\}_{k \ge t_0} \text{ are jointly Gaussian distributed} \tag{7.2-1d} \]

Here, for the sake of simplicity, we have taken the cross-covariance between $\xi(k)$
and $\zeta(k)$ to be zero at any instant $k$.

In the present partial state information case, the admissible regulation strategy
allows $u(k)$ to be measurable w.r.t. the $\sigma$-field generated by
$\{z^k, u^{k-1}\}$, $z^k := \{z(i)\}_{i=t_0}^{k}$

\[ u(k) \in \sigma\{z^k, u^{k-1}\} = \sigma\{z^k\} \tag{7.2-2} \]

Note that by (1-1a) $\sigma\{z^k\} \subset \mathcal{F}_k$.

Following the lines of the previous section, we take the performance index to be
$\mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\}$ as in (1-3) and consider the following problem.

LQ Gaussian (LQG) regulator Consider the linear Gaussian plant (1)
and the quadratic performance index (1-3). Find an input sequence $u^0_{[t_0,T)}$
to the plant that minimizes the performance index among all the admissible
regulation strategies (2).

By the smoothing properties of conditional expectations we can rewrite the
performance index as follows

\[ \mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\} = \mathcal{E}\Big\{ \sum_{k=t_0}^{T} \mathcal{E}\{\ell(k, x(k), u(k)) \mid z^k\} \Big\} \tag{7.2-3} \]

Further, recalling (1-3b), (1-3c) and Lemma D.2-1 of Appendix D,

\[ \mathcal{E}\{\ell(k, x(k), u(k)) \mid z^k\} = \ell(k, x(k \mid k), u(k)) + \mathrm{Tr}\,[\psi_x(k)\,\mathrm{Cov}(x(k) \mid z^k)] \tag{7.2-4} \]

Here $x(k \mid k)$ denotes the conditional expectation

\[ x(k \mid k) = \mathcal{E}\{x(k) \mid z^k\} \]

Thanks to the Gaussianity assumption (1d), by virtue of Fact 6.2-1, this is given
by the following Kalman filter formulae (6.2-46), (6.2-47)

\[ x(k \mid k) = x(k \mid k-1) + K(k)e(k) \qquad [(6.2\text{-}46)] \tag{7.2-5} \]

\[ x(k+1 \mid k) = \Phi(k)x(k \mid k) + G(k)u(k) \tag{7.2-6} \]

\[ K(k) = \Pi(k)H'(k)\big[H(k)\Pi(k)H'(k) + \Psi_\zeta(k)\big]^{-1} \qquad [(6.2\text{-}22a)] \tag{7.2-7} \]

with $e(k) = z(k) - H(k)x(k \mid k-1)$ and $\Pi(k)$, the state prediction error
covariance, satisfying the forward Riccati (filter) recursion (6.2-26). The above filtering
equations are to be initialized from

\[ x(t_0 \mid t_0 - 1) = \mathcal{E}\{x(t_0)\} \quad \text{and} \quad \Pi(t_0) = \mathrm{Cov}(x(t_0)). \]
Further, recalling (6.2-31), we have

\[
\begin{aligned}
\Pi(k \mid k) &:= \mathrm{Cov}(x(k) \mid z^k) \\
&= \Pi(k) - \Pi(k)H'(k)\big[H(k)\Pi(k)H'(k) + \Psi_\zeta(k)\big]^{-1}H(k)\Pi(k)
\end{aligned}
\tag{7.2-8}
\]

Taking into account the above considerations, (3) becomes

\[ \mathcal{E}\{J(t_0, x(t_0), u_{[t_0,T)})\} = \mathcal{E}\Big\{ \sum_{k=t_0}^{T} \Big[ \ell(k, x(k \mid k), u(k)) + \mathrm{Tr}\,[\psi_x(k)\Pi(k \mid k)] \Big] \Big\} \tag{7.2-9} \]

Now $\Pi(k \mid k)$ can be precomputed, being only dependent on $\mathrm{Cov}(x(t_0))$, and the
$\psi_x(k)$'s are given weighting matrices. Thus, minimizing (9) w.r.t. $u_{[t_0,T)}$ under the
admissible regulation strategy (2) is the same as minimizing

\[ \mathcal{E}\Big\{ \sum_{k=t_0}^{T} \ell(k, x(k \mid k), u(k)) \Big\} \]

w.r.t. $u_{[t_0,T)}$ for the plant

\[ x(k+1 \mid k+1) = \Phi(k)x(k \mid k) + G(k)u(k) + K(k+1)e(k+1) \]

with complete state information. In particular, note that (1-1b) and (1-1c) are
satisfied for $\xi(k)$ changed into $e(k+1)$. The theorem that follows is an immediate
consequence of the above considerations.

Theorem 7.2-1 (LQG regulation). The optimal LQG regulation law under
partial state information, viz. fulfilling the admissible regulation strategy (2), is given
by the filtered-state-feedback law

\[ u(t) = F(t)x(t \mid t), \qquad t \in [t_0, T) \tag{7.2-10} \]

In (10): $F(t)$ denotes the optimal feedback-gain matrix which is the same as in
the deterministic LQR case ($\xi(t) = O_n$) of Theorem 2.3-1 and given by (2.3-2) in
terms of the solution $\mathcal{P}(t+1)$ of the Riccati backward (regulation) difference equation
(2.3-3)-(2.3-6); and $x(t \mid t) = \mathcal{E}\{x(t) \mid z^t\}$ is the Kalman filtered state provided by
(5)-(7) whose optimal gain matrix $K(t)$ is given by (7) in terms of the solution
$\Pi(t)$ of the Riccati forward (filtering) equation (6.2-26). Further, the minimum
cost incurred over the regulation horizon $[t, T]$ for the optimal input sequence $u_{[t,T)}$
is given by



min J <t,x(t),u [t T) ) 

U [t,T) 



\\£{x(t)}\\ 2 v{t) +Tr[V(t)U(t\t)] 

T-l 

+!)**(*)] + 

k=t 
T 

^Tr[^(fc)n(fc|fc)] 



(7.2-11) 



k=t 



with $\Pi(k \mid k)$ as in (8).

Figure 7.2-1: The LQG regulator.



Problem 7.2-1 By using (9) and (1-17), prove (11). 

It is instructive to compare (11) with (1-18). W.r.t. (1-18) the only extra term
in (11) is its last summation. This depends on the posterior covariance matrices
$\Pi(k \mid k) = \mathrm{Cov}(x(k) \mid z^k)$ which are nonzero in the partial state information case.

The LQG regulator solution is depicted in Fig. 1. It is to be pointed out
that the computations involved refer separately to the regulation and state filtering
problems respectively. In fact, computation of $F(t)$ involves the "cost" parameters
$\psi_x(k)$, $M(k)$, $\psi_u(k)$, $\psi_x(T)$ but not the "noise" parameters $\Psi_\xi(k)$, $\Psi_\zeta(k)$,
$\mathcal{E}\{x(t_0)\}$ and $\mathrm{Cov}(x(t_0))$, whereas the converse is true for Kalman filter design.
The regulator then separates into two parts which are independent: a filtering
stage and a regulation stage. The filtering stage is not affected by the regulation
objective and conversely. This is the Separation Principle.

The property that the optimal input takes the form u(t) = F(t)x(t \ t) where 
F(t) is the feedback-gain matrix as in the deterministic case or the complete state 
information case, is referred to as the Certainty-Equivalence Principle. This, in 
other words, states that the optimal LQG regulator acts as if the state filtered 
estimate x(t \ t) were equal to the true state x(t). 
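
The separation of the two designs can be made concrete with a scalar sketch. The code below is an illustration under stated assumptions, not the book's implementation: it takes a time-invariant scalar plant $x(k+1) = a\,x(k) + b\,u(k) + \xi(k)$, $z(k) = h\,x(k) + \zeta(k)$, a stage cost without cross term, and the standard scalar forms of the regulation and filtering Riccati recursions in place of (2.3-3)-(2.3-6) and (6.2-26), which are not reproduced here. The function names are hypothetical.

```python
def regulator_gains(a, b, q, r, q_T, N):
    """Backward Riccati recursion: uses plant and cost parameters only."""
    P = q_T
    F = [0.0] * N
    for t in range(N - 1, -1, -1):
        F[t] = -a * b * P / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
    return F


def filter_gains(a, h, var_xi, var_zeta, var_x0, N):
    """Forward Riccati recursion: uses plant and noise parameters only.

    Returns the Kalman gains K(k) and the posterior variances Pi(k|k).
    Neither depends on the measurements or on the applied inputs.
    """
    Pi = var_x0                          # Pi(t0) = Cov(x(t0))
    K, Pi_post = [], []
    for _ in range(N):
        gain = Pi * h / (h * h * Pi + var_zeta)
        post = Pi - gain * h * Pi        # scalar form of (8)
        K.append(gain)
        Pi_post.append(post)
        Pi = a * a * post + var_xi       # prediction variance for the next step
    return K, Pi_post


def lqg_step(x_pred, z, F_t, K_t, a, b, h):
    """One step of u(t) = F(t) x(t|t); returns (u, x(t|t), x(t+1|t))."""
    x_filt = x_pred + K_t * (z - h * x_pred)   # measurement update (5)
    u = F_t * x_filt                           # certainty-equivalent input (10)
    return u, x_filt, a * x_filt + b * u       # time update (6)


if __name__ == "__main__":
    F = regulator_gains(0.9, 1.0, 1.0, 0.1, 1.0, 10)
    K, Pi_post = filter_gains(0.9, 1.0, 0.25, 0.5, 2.0, 10)
    print(lqg_step(1.0, 2.0, F[0], K[0], 0.9, 1.0, 1.0))
```

Note that `regulator_gains` has no noise argument and `filter_gains` has no cost argument, which mirrors the Separation Principle; the input is then formed exactly as in the complete state information case, with $x(t \mid t)$ in place of $x(t)$.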

It is interesting to pause so as to identify the reason responsible for the validity of
the Certainty-Equivalence Principle in the LQG regulation problem. In a general
stochastic regulation problem with partial state information the input sequence
$u_{[t_0,T)}$ has the so-called dual effect [Fel65], in that it affects both the variables to
be regulated and the posterior distribution of the plant state given the
observations. Under such circumstances, the optimal regulator may exhibit a separation
property but not satisfy the Certainty-Equivalence Principle [Wit71], [BST74]. In
this connection, consider a nonlinear plant composed of a system governed by the
equation $x(k+1) = \Phi(k)x(k) + G(k)u(k) + \xi(k)$, and a possibly nonlinear sensing
device giving observations $z(k) = h(k, x(k), \zeta(k))$. For the sake of simplicity,
assume that $\xi$ and $\zeta$ have the same properties as in (1). Then, in [BST74] it is
shown that the optimal regulator minimizing (3) fulfills the Certainty-Equivalence
Principle if and only if the posterior covariance of the plant state $x(t)$ given the
observations $z^t$ is the same as if $u_{[t_0,t)}$ were the zero sequence, viz. the input has
no dual effect on the second moments of the posterior distribution. In this respect,
we note that in the LQG regulator solution the posterior uncertainty on the true
state $x(t)$, as measured by $\mathrm{Cov}(x(t) \mid z^t)$ in (8), is not affected by $u_{[t_0,t)}$. In fact
$\mathrm{Cov}(x(t) \mid z^t)$ can be precomputed, being unaffected by the realization of $z^t$ and the
specific input sequence $u_{[t_0,t)}$. This means that the input has no dual effect and,
hence, the Certainty-Equivalence Principle applies.



7.2.2 Linear Non-Gaussian Plants

Consider the linear stochastic plant (1a). Again we write

\[ \nu(k) := [\, \xi'(k) \ \ \zeta'(k) \,]' \tag{7.2-12a} \]

In contrast with (1d), here we assume that the involved random vectors are not
Gaussian. Nonetheless, their means and covariances are as before

\[ \left. \begin{array}{l} \mathcal{E}\{\nu(k)\} = O_{n+p} \\ \mathcal{E}\{\nu(k)\nu'(i)\} = \Psi_\nu(k)\delta_{k,i} = \begin{bmatrix} \Psi_\xi(k) & O_{n \times p} \\ O_{p \times n} & \Psi_\zeta(k) \end{bmatrix} \delta_{k,i} \end{array} \right\} \tag{7.2-12b} \]

\[ \mathcal{E}\{\nu(k)x'(t_0)\} = O_{(n+p) \times n} \tag{7.2-12c} \]

In such a case, Theorem 6.2-2 states that the Kalman filtered state $x(t \mid t)$ is
only the linear MMSE estimate of $x(t)$ based on $z^t$ and not the conditional mean
$\mathcal{E}\{x(t) \mid z^t\}$ as in the Gaussian case. Further, $\mathrm{Cov}(\varepsilon(t))$, $\varepsilon(t) := x(t) - x(t \mid t)$,
is still given by (8) also in the non-Gaussian case. Finally, by Lemma D.2-1,
$\mathcal{E}\{\|x(k)\|^2_{\psi_x(k)}\}$ depends only on the mean and covariance of $x(k)$. These observations lead
to the following result.


Result 7.2-1. Consider the linear stochastic plant (1a), (12a)-(12c). Assume that
the involved random variables are possibly non-Gaussian and the cost is quadratic as
in (1-3). Then, the optimal input sequence $u^0_{[t_0,T)}$ to the plant (1a) that minimizes
(1-3) among all the admissible linear regulation strategies

\[ u(t) = f(t, z^t, u^{t-1}), \qquad f(t, \cdot, \cdot) \ \text{linear} \tag{7.2-13} \]

is given by the linear feedback law (10) with $F(t)$ and $x(t \mid t)$ computed as indicated
in Theorem 1 as if the involved random vectors were jointly Gaussian distributed.



Problem 7.2-2 Prove the conclusions of Result 1. 



7.2.3 Steady-State LQG Regulation 

Up to now the results obtained on LQ stochastic regulation have required no
stationarity assumptions on the involved processes nor time-invariant system matrices.
However, as with the deterministic LQR problem of Chapter 2, an interesting
topic is the analysis of the limiting behaviour of the LQG regulator as the regulation
horizon goes to infinity.

Consider the time-invariant linear Gaussian plant

\[ \left. \begin{array}{l} x(k+1) = \Phi x(k) + G_u u(k) + \xi(k) \\ y(k) = Hx(k) \\ z(k) = H_z x(k) + \zeta(k) \end{array} \right\} \tag{7.2-14a} \]

with

\[ \nu(k) := [\, \xi'(k) \ \ \zeta'(k) \,]' \]

a finite-variance wide-sense stationary Gaussian process satisfying (2-1) with

\[ \Psi_\xi(k) = \Psi_\xi = GG', \qquad \Psi_\zeta(k) = \Psi_\zeta > 0 \tag{7.2-14b} \]

where $G$ has full column-rank. Setting $N := T - t_0 \in \mathbb{Z}_1$, consider also the
performance index

\[ \frac{1}{N}\, \mathcal{E}\Big\{ \sum_{k=t_0}^{T-1} \ell(y(k), u(k)) + \|x(T)\|^2_{\psi_x(T)} \Big\} \tag{7.2-15a} \]

\[ \ell(y, u) := \|y\|^2_{\psi_y} + \|u\|^2_{\psi_u} \tag{7.2-15b} \]

\[ \psi_y = \psi_y' > 0, \qquad \psi_u = \psi_u' > 0, \qquad \psi_x(T) = \psi_x'(T) \ge 0 \tag{7.2-15c} \]

along with an admissible regulation strategy given by (2). We see that in (14a)
$y(k)$ represents an output vector to be regulated and $z(k)$ the sensor output or
observation at time $k$. We know from Theorem 1 that the solution of the LQG
regulation problem for every $N \in \mathbb{Z}_1$ satisfies the Certainty-Equivalence Principle.
We are now interested in establishing the limiting solution as $N \to \infty$ and
$t_0 \to -\infty$. In the problem that we have just set up there are two system triples, viz.
$(\Phi, G_u, H)$ and $(\Phi, G, H_z)$. They concern the regulation and the state-filtering
problem, respectively. On the other hand, we know that both problems admit
limiting asymptotically stable solutions provided that

\[ (\Phi, G_u, H) \ \text{and} \ (\Phi, G, H_z) \ \text{are both stabilizable and detectable} \tag{7.2-16} \]

Further, by the Separation Principle, the two limiting processes are seen to be
noninteracting. These considerations make it plausible for the above problem to have
the following conclusions.



Result 7.2-2. (Steady-state LQG regulation) Consider the time-invariant
linear Gaussian plant (14) and the quadratic performance index (15) with
time-invariant weights. Let (16) hold. Then, as $N \to \infty$ and $t_0 \to -\infty$ the LQG
regulation law, optimal among all the admissible regulation strategies (2), equals

\[ u(t) = Fx(t \mid t) \tag{7.2-17} \]

Here $F$ is the constant feedback-gain matrix solving the time-invariant
deterministic LQOR problem ($\xi(t) = O_n$ and $\zeta(t) = O_p$) as in Theorem 2.4-5

\[ F = -(\psi_u + G_u'PG_u)^{-1}G_u'P\Phi \tag{7.2-18a} \]

where $P = P' \ge 0$ satisfies the regulation ARE

\[ P = \Phi'P\Phi - \Phi'PG_u(\psi_u + G_u'PG_u)^{-1}G_u'P\Phi + H'\psi_yH \tag{7.2-18b} \]

Further, $x(t \mid t)$ is generated by the steady-state Kalman filter as in Theorem 6.2-3

\[ x(t \mid t) = x(t \mid t-1) + Ke(t) \tag{7.2-19a} \]

\[ x(t+1 \mid t) = \Phi x(t \mid t) + G_u u(t) \tag{7.2-19b} \]

\[ e(t) = z(t) - H_z x(t \mid t-1) \tag{7.2-19c} \]

\[ K = \Pi H_z'(H_z\Pi H_z' + \Psi_\zeta)^{-1} \tag{7.2-19d} \]

where $\Pi = \Pi' \ge 0$ satisfies the filtering ARE

\[ \Pi = \Phi\Pi\Phi' - \Phi\Pi H_z'(H_z\Pi H_z' + \Psi_\zeta)^{-1}H_z\Pi\Phi' + GG' \tag{7.2-19e} \]

Eq. (17)-(19) make the closed-loop system asymptotically stable. Hence, under
the stated conditions, as $N \to \infty$ and $t_0 \to -\infty$ all the involved processes become
stationary and (17) minimizes the stochastic steady-state cost

\[ \mathrm{Tr}\,[\psi_y\Psi_y + \psi_u\Psi_u] \tag{7.2-20a} \]

where

\[ \Psi_y = \lim_{N \to \infty} \mathcal{E}\{y(k)y'(k)\}, \qquad \Psi_u = \lim_{N \to \infty} \mathcal{E}\{u(k)u'(k)\} \tag{7.2-20b} \]
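
The two algebraic Riccati equations can be solved independently, as the following scalar sketch illustrates. It is a minimal example under stated assumptions: a scalar plant with $\Phi = a$, $G_u = b$, $H_z = h$, $G = g$, the regulation weight $q$ standing for $H'\psi_yH$, and plain fixed-point iteration of (18b) and (19e) from zero, which is only one of several ways to compute the solutions. The function names and numerical values are hypothetical.

```python
def regulation_are(a, b, q, r, iters=500):
    """Iterate the scalar form of (18b) from P = 0 and return (P, F)."""
    P = 0.0
    for _ in range(iters):
        P = a * a * P - (a * b * P) ** 2 / (r + b * b * P) + q
    F = -a * b * P / (r + b * b * P)             # scalar form of (18a)
    return P, F


def filtering_are(a, h, g, var_zeta, iters=500):
    """Iterate the scalar form of (19e) from Pi = 0 and return (Pi, K)."""
    Pi = 0.0
    for _ in range(iters):
        Pi = a * a * Pi - (a * Pi * h) ** 2 / (h * h * Pi + var_zeta) + g * g
    K = Pi * h / (h * h * Pi + var_zeta)         # scalar form of (19d)
    return Pi, K


def closed_loop_poles(a, b, h, F, K):
    """Regulator pole and filter pole of the scalar LQG loop."""
    return a + b * F, a - a * K * h


if __name__ == "__main__":
    P, F = regulation_are(1.2, 1.0, 1.0, 1.0)    # open-loop unstable plant
    Pi, K = filtering_are(1.2, 1.0, 0.5, 1.0)
    print(closed_loop_poles(1.2, 1.0, 1.0, F, K))
```

For this open-loop unstable example both poles lie strictly inside the unit circle, in agreement with the asymptotic stability claimed for (17)-(19).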



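Problem 7.2-3 Let $\tilde{x}(t) := x(t) - x(t \mid t-1)$. Show that the control law (17)-(19) gives the
closed-loop system

\[
\begin{bmatrix} x(t+1) \\ \tilde{x}(t+1) \end{bmatrix} =
\begin{bmatrix} \Phi + G_uF & -G_uF(I_n - KH_z) \\ O & \Phi - \tilde{K}H_z \end{bmatrix}
\begin{bmatrix} x(t) \\ \tilde{x}(t) \end{bmatrix} +
\begin{bmatrix} I_n & G_uFK \\ I_n & -\tilde{K} \end{bmatrix}
\begin{bmatrix} \xi(t) \\ \zeta(t) \end{bmatrix}
\tag{7.2-21}
\]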



with $\tilde{K} = \Phi K$. Hence, recalling (4.4-25) and (6.2-68), provided that $(\Phi, G_u, H_z)$ is controllable
and reconstructible, the $d$-characteristic polynomial $\chi_{LQG}$ of the LQG regulated system equals

\[ \chi_{LQG}(d) = \frac{\det E(d)}{\det E(0)} \times \frac{\det C(d)}{\det C(0)} \tag{7.2-22} \]

Problem 7.2-4 (Linear possibly non-Gaussian plants in innovations form) Consider the linear
time-invariant plant (14a) with $\xi(k) = G\zeta(k)$, possibly non-Gaussian, satisfying (1-1b)-(1-1c).
Let

$\Phi - GH_z$ be a stability matrix

and

$(\Phi, G_u, H)$ be stabilizable and detectable.
Exploit the result in Problem 6.2-7 to show that as $t \to \infty$ the LQ stochastic regulator minimizing
(1-3) among all the admissible regulation strategies (2), is again given by (17)-(19) with

\[ x(t \mid t) = \Phi x(t-1 \mid t-1) + G_u u(t-1) + G\,[z(t-1) - H_z x(t-1 \mid t-1)]. \]



Main points of the section The solution of the LQG regulation problem for
partial state information is given by the state-filtered feedback law $F(t)x(t \mid t)$
where $x(t \mid t) = \mathcal{E}\{x(t) \mid z^t\}$ is generated via Kalman filtering and $F(t)$ is the same
as in deterministic LQ regulation. The steady-state LQG regulator is obtained
by cascading the deterministic steady-state LQ regulator with the steady-state
Kalman filter generating $x(t \mid t)$.

7.3 Steady-State Regulation of CARMA Plants:
Solution via Polynomial Equations 

In this section we consider various stochastic regulation problems for the CARMA 
plant 

\[ A(d)y(k) = B(d)u(k) + C(d)e(k) \tag{7.3-1} \]

where $e$, the innovations process of $y$, will be assumed to be zero-mean, wide-sense
stationary white with $\mathcal{E}\{e(k)e'(k)\} = \Psi_e > 0$. Other additional requirements on $e$
will be specified whenever needed. In connection with (1), we assume that

• $A^{-1}(d)\,[\, B(d) \ \ C(d) \,]$ is an irreducible left MFD; (7.3-2a)

• $C(d)$ is Hurwitz; (7.3-2b)

• the gcld's of $A(d)$ and $B(d)$ are strictly Hurwitz. (7.3-2c)

We point out that, by Theorem 6.2-4, (2b) entails no limitation, and (2c) is a 
necessary condition (Cf. Problem 3.2-3) for the existence of a linear compensator, 
acting on the manipulated input u only, capable of making the resulting feedback 
system internally stable. 

7.3.1 Single Step Stochastic Regulation 

Here we consider the CARMA plant (1)-(2) along with the following additional
assumptions on $e$

\[ \mathcal{E}\{e(k+1) \mid e^k\} = O_p \quad \text{a.s.} \tag{7.3-3a} \]

\[ \mathcal{E}\{e(k+1)e'(k+1) \mid e^k\} = \Psi_e > 0 \quad \text{a.s.} \tag{7.3-3b} \]

As we shall see, the martingale difference properties (3) are sufficient for tackling
in full generality the regulation problem we are going to set up. In particular,
no Gaussianity assumption will be required. Further, we assume that $u(k)$ has to
satisfy the admissible regulation strategy

\[ u(k) \in \sigma\{y^k, u^{k-1}\} = \sigma\{y^k\} \tag{7.3-4} \]

Further, the performance index is given by the conditional expectation

\[ C = \mathcal{E}\big\{ \|y(t+\tau)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} \,\big|\, y^t \big\} \tag{7.3-5} \]

In (5) the integer $\tau \in \mathbb{Z}_1$ equals the plant I/O delay

\[ \tau := \mathrm{ord}\, B(d) \ge 1 \]

We consider the following problem which is the extension of the deterministic single
step regulation of Sect. 2.7 to the present stochastic setting.

Single Step Stochastic regulation Consider the CARMA plant (1)-(3)
and the performance index given by the single step conditional expectation
(5). Find an optimal regulation law for the plant which makes the resulting
feedback system internally stable and in stochastic steady-state minimizes
the performance index among all the admissible regulation strategies (4).
It is convenient to explicitly exhibit in (1) the delay $\tau$ by introducing the polynomial
matrix $B_0(d)$ such that

\[ d^\tau B_0(d) = B(d) \tag{7.3-6a} \]

\[ B_0(d) = B_\tau + B_{\tau+1}d + \cdots + B_{\partial B}\,d^{\partial B - \tau} \tag{7.3-6b} \]

Consequently, (1) becomes

\[ A(d)y(k) = B_0(d)u(k-\tau) + C(d)e(k) \tag{7.3-6c} \]

In order to tackle the regulation problem stated above we first consider the following
prediction problem.

MMSE $\tau$-step-ahead prediction Consider the CARMA plant (1)-(3)
whose input $u$ satisfies (4). Find a finite variance vector

\[ \hat{y}(k+\tau \mid k) \in \sigma\{y^k\} \]

such that, for every $R = R' > 0$, in stochastic steady-state

\[ \mathcal{E}\big\{ \|y(k+\tau) - \hat{y}(k+\tau \mid k)\|^2_R \big\} \le \mathcal{E}\big\{ \|y(k+\tau) - \hat{y}(k)\|^2_R \big\} \tag{7.3-7} \]

among all finite variance vectors $\hat{y}(k) \in \sigma\{y^k\}$.

$\hat{y}(k+\tau \mid k)$ is called the MMSE $\tau$-step-ahead prediction of $y(k+\tau)$.

Let $(Q_\tau(d), G_\tau(d))$ be the minimum degree solution w.r.t. $Q_\tau(d)$ of the
Diophantine equation

\[ \left. \begin{array}{l} C(d) = A(d)Q_\tau(d) + d^\tau G_\tau(d) \\ \partial Q_\tau(d) \le \tau - 1 \end{array} \right\} \tag{7.3-8} \]

Then, (6c) can be rewritten as

\[ y(k+\tau) = Q_\tau(d)C^{-1}(d)B_0(d)u(k) + A^{-1}(d)G_\tau(d)C^{-1}(d)A(d)y(k) + Q_\tau(d)e(k+\tau) \tag{7.3-9} \]

and the next theorem follows.

Theorem 7.3-1. (MMSE $\tau$-step-ahead prediction) Consider the
CARMA plant (1)-(3) whose input $u$ satisfies (4). Let both $u$ and $y$ be mean-square
bounded. Then, provided that the transfer matrix

\[ H_{\hat{y} \mid u,y}(d) := [\, Q_\tau(d)C^{-1}(d)B_0(d) \ \ \ A^{-1}(d)G_\tau(d)C^{-1}(d)A(d) \,] \]

is stable, the MMSE $\tau$-step-ahead prediction of $y$ equals

\[
\begin{aligned}
\hat{y}(k+\tau \mid k) &= y(k+\tau) - Q_\tau(d)e(k+\tau) \\
&= Q_\tau(d)C^{-1}(d)B_0(d)u(k) + A^{-1}(d)G_\tau(d)C^{-1}(d)A(d)y(k)
\end{aligned}
\tag{7.3-10}
\]

where the polynomial matrix pair $(Q_\tau(d), G_\tau(d))$ is the minimum degree solution of
the Diophantine equation (8). Further, the MMSE prediction error $\tilde{y}(k+\tau \mid k)$ is
given by the moving average

\[ \tilde{y}(k+\tau \mid k) := y(k+\tau) - \hat{y}(k+\tau \mid k) = Q_\tau(d)e(k+\tau) \tag{7.3-11} \]
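Proof Let $\hat{y}_0(k)$ be given by the R.H.S. of (10). Then

\[
\begin{aligned}
\mathcal{E}\big\{ \|y(k+\tau) - \hat{y}(k)\|^2_R \big\} &= \mathcal{E}\big\{ \|\hat{y}_0(k) - \hat{y}(k) + Q_\tau(d)e(k+\tau)\|^2_R \big\} \\
&= \mathcal{E}\big\{ \|\hat{y}_0(k) - \hat{y}(k)\|^2_R \big\} + \mathrm{Tr}\,\big[ R\,\langle Q_\tau(d)\Psi_eQ_\tau^*(d) \rangle \big]
\end{aligned}
\]

Hence $\hat{y}(k+\tau \mid k) = \hat{y}_0(k)$. The last equality in the above equation follows since by the smoothing
properties of conditional expectations

\[
\begin{aligned}
\mathcal{E}\big\{ \|\hat{y}_0(k) - \hat{y}(k) + Q_\tau(d)e(k+\tau)\|^2_R \big\}
&= \mathcal{E}\big\{ \mathcal{E}\big\{ \|\hat{y}_0(k) - \hat{y}(k) + Q_\tau(d)e(k+\tau)\|^2_R \,\big|\, y^k \big\} \big\} \\
&= \mathcal{E}\big\{ \|\hat{y}_0(k) - \hat{y}(k)\|^2_R + \mathrm{Tr}\,\big[ R\,\mathcal{E}\{ f(k+\tau)f'(k+\tau) \mid y^k \} \big] \big\} \qquad [\text{Lemma D.2-1}]
\end{aligned}
\]

where $f(k+\tau) := Q_\tau(d)e(k+\tau) = \sum_{i=0}^{\tau-1} Q_i e(k+\tau-i)$ if $Q_\tau(d) = \sum_{i=0}^{\tau-1} Q_i d^i$. The conditional
expectation inside the trace operator equals

\[ \sum_{i,j=0}^{\tau-1} Q_i\, \mathcal{E}\{ e(k+\tau-i)e'(k+\tau-j) \mid y^k \}\, Q_j' = \sum_{i=0}^{\tau-1} Q_i\Psi_eQ_i' = \langle Q_\tau(d)\Psi_eQ_\tau^*(d) \rangle. \]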

Remark 7.3-1 $C(d)$ strictly Hurwitz guarantees stability of the transfer matrix 
$H_{\hat{y}|u,y}(d)$. For stability of $H_{\hat{y}|u,y}(d)$, however, a necessary and sufficient condition is 
that the denominator matrices of the irreducible MFDs of $H_{\hat{y}|u,y}(d)$ be strictly 
Hurwitz. Should $C(d)$ be Hurwitz but not strictly Hurwitz, stability may still be 
obtained if all the roots of $C(d)$ on the unit circle are cancelled in every entry of 
$H_{\hat{y}|u,y}(d)$. Hence, in such a case, stability can only be concluded after computing 
$H_{\hat{y}|u,y}(d)$. □ 
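
For a SISO plant, the Diophantine equation (8) can be solved by $\tau$ steps of polynomial long division, and the prediction error formula (11) can be checked on simulated data. The following Python sketch illustrates this under stated assumptions: NumPy is available, polynomials are stored as coefficient arrays in ascending powers of $d$, $A(0) \neq 0$, $u \equiv 0$, and all signals are zero before time 0. The function names `solve_predictor_diophantine` and `filt`, and the numerical plant, are illustrative choices and do not come from the text.

```python
import numpy as np

def solve_predictor_diophantine(a, c, tau):
    """Return (q, g) with c(d) = a(d) q(d) + d^tau g(d) and deg q <= tau - 1.

    Coefficients are in ascending powers of d; a[0] must be nonzero.
    """
    a = np.asarray(a, dtype=float)
    c = np.asarray(c, dtype=float)
    n = max(len(a), len(c)) + tau
    a_pad = np.zeros(n)
    a_pad[:len(a)] = a
    rem = np.zeros(n)
    rem[:len(c)] = c
    q = np.zeros(tau)
    for i in range(tau):                 # one long-division step per power of d
        q[i] = rem[i] / a_pad[0]
        rem[i:] -= q[i] * a_pad[:n - i]
    return q, rem[tau:]                  # what is left over is d^tau g(d)

def filt(num, den, x):
    """Response of num(d)/den(d) to the sequence x, with zero initial conditions."""
    y = np.zeros(len(x))
    for k in range(len(x)):
        acc = sum(num[i] * x[k - i] for i in range(len(num)) if k - i >= 0)
        acc -= sum(den[j] * y[k - j] for j in range(1, len(den)) if k - j >= 0)
        y[k] = acc / den[0]
    return y

# Hypothetical example: A(d) = 1 - 0.9 d, C(d) = 1 + 0.5 d, tau = 2, u = 0.
a, c, tau = [1.0, -0.9], [1.0, 0.5], 2
q, g = solve_predictor_diophantine(a, c, tau)

rng = np.random.default_rng(0)
e = rng.standard_normal(200)
y = filt(c, a, e)                        # A(d) y = C(d) e
y_hat = filt(g, c, y)                    # y_hat[k] is the prediction of y[k + tau], cf. (10)
err = y[tau:] - y_hat[:-tau]             # prediction error sequence
q_e = filt(q, [1.0], e)[tau:]            # Q_tau(d) e(k + tau), cf. (11)
```

In this sketch `err` and `q_e` should agree sample by sample, which is the SISO, zero-input form of (11).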

We now consider the minimization of (5) w.r.t. $u(t) \in \sigma\{y^t\}$. We find 

$$\begin{aligned} C &= \mathcal{E}\{\|\hat{y}(t + \tau \mid t) + \tilde{y}(t + \tau \mid t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} \mid y^t\} \\ &= \|\hat{y}(t + \tau \mid t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} + \mathrm{Tr}\left[\psi_y \langle Q_\tau(d)\Psi_e Q_\tau^*(d)\rangle\right] \quad [(11)] \end{aligned}$$

Equating to zero the first derivatives of $C$ w.r.t. the components of $u(t)$ and setting 

$$L := Q_\tau(0)C^{-1}(0)B_0(0) = A^{-1}(0)B_0(0) \quad [(8)] \tag{7.3-12}$$

we find 

$$L'\psi_y \hat{y}(t + \tau \mid t) + \psi_u u(t) = O_m \tag{7.3-13}$$

or, provided that 

$$\psi_u + L'\psi_y L \ \text{is nonsingular,} \tag{7.3-14}$$

$$u(t) = -(\psi_u + L'\psi_y L)^{-1} L'\psi_y \left[\hat{y}(t + \tau \mid t) - Lu(t)\right] \tag{7.3-15}$$

It is instructive to specialize (15) to the SISO case where w.l.o.g. we can assume: 

$$\psi_y = 1; \quad \psi_u = \rho \ge 0; \quad A(0) = C(0) = 1 \tag{7.3-16}$$

Further, 

$$B_0(0) = b_\tau \quad [(6b)]$$

Then, the Single Step Stochastic regulation law becomes 

$$\left[\rho C(d) + b_\tau Q_\tau(d)B_0(d)\right] u(t) = -b_\tau G_\tau(d)y(t) \tag{7.3-17}$$

Problem 7.3-1 Let the polynomials $\rho C(d) + b_\tau Q_\tau(d)B_0(d)$ and $G_\tau(d)$ be coprime. Show that 
the d-characteristic polynomial $\chi_{cl}(d)$ of the closed-loop system (1) and (17) equals 

$$\chi_{cl}(d) = \gamma \left[\frac{\rho}{b_\tau} A(d) + B_0(d)\right] C(d) \tag{7.3-18}$$

with $\gamma = b_\tau/(\rho + b_\tau^2)$. Compare this with (2.7-13) to conclude that $C(d)$ plays the role of the 
characteristic polynomial of the implicit Kalman filter embedded in the dynamic compensator (17). 
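
As a numerical companion to (17) and (18), the sketch below builds the SISO regulator polynomials and checks that $A(d)R(d) + B(d)S(d)$, with $R(d)u(t) = -S(d)y(t)$ the law (17), factors as $C(d)[\rho A(d) + b_\tau B_0(d)]$, i.e. is proportional to (18). It is only an illustration: NumPy is assumed, coefficients are stored in ascending powers of $d$, and the helper names (`padd`, `solve_predictor_diophantine`, `gmv_regulator`) and the example plant are hypothetical.

```python
import numpy as np

def padd(p, q):
    """Sum of two polynomials given as ascending coefficient arrays."""
    out = np.zeros(max(len(p), len(q)))
    out[:len(p)] += p
    out[:len(q)] += q
    return out

def solve_predictor_diophantine(a, c, tau):
    """Return (q, g) with c(d) = a(d) q(d) + d^tau g(d), deg q <= tau - 1."""
    n = max(len(a), len(c)) + tau
    a_pad = np.zeros(n)
    a_pad[:len(a)] = a
    rem = np.zeros(n)
    rem[:len(c)] = c
    q = np.zeros(tau)
    for i in range(tau):
        q[i] = rem[i] / a_pad[0]
        rem[i:] -= q[i] * a_pad[:n - i]
    return q, rem[tau:]

def gmv_regulator(a, b, c, tau, rho):
    """Polynomials (r, s) of the law r(d) u(t) = -s(d) y(t), cf. (17)."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    b0 = b[tau:]                                   # B(d) = d^tau B_0(d)
    q, g = solve_predictor_diophantine(a, c, tau)
    r = padd(rho * c, b0[0] * np.convolve(q, b0))  # rho C + b_tau Q B_0
    s = b0[0] * g                                  # b_tau G
    return r, s

# Hypothetical plant: A = 1 - 0.9 d, B = d + 0.5 d^2 (tau = 1), C = 1 + 0.3 d.
a, b, c, tau, rho = [1.0, -0.9], [0.0, 1.0, 0.5], [1.0, 0.3], 1, 0.1
r, s = gmv_regulator(a, b, c, tau, rho)

closed_loop = padd(np.convolve(a, r), np.convolve(b, s))          # A r + B s
b0 = np.array(b[tau:])
factored = np.convolve(c, padd(rho * np.array(a), b0[0] * b0))    # C (rho A + b_tau B_0)
```

The two arrays `closed_loop` and `factored` should coincide up to trailing zeros, which makes the role of $C(d)$ as a closed-loop factor explicit.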

Remark 7.3-2 From (18) it follows that in the generic case, closed-loop stability 
requires $C(d)$ to be strictly Hurwitz. Should $C(d)$ be Hurwitz but not strictly Hurwitz, 
a comment similar to the one of Remark 1 applies here. Further, existence of 
steady-state wide-sense stationarity of the involved processes, as in the formulation 
of the Single Step Stochastic regulation problem, implicitly requires that (15) 
yields an internally stable closed-loop system. □ 

Theorem 7.3-2. (Single Step Stochastic regulation) Consider the 
CARMA plant (1)-(3), the single step performance index (5) and the admissible 
regulation strategy (4). Let (14) be satisfied. Then, provided that (15) makes 
the closed-loop system internally stable, the optimal single step stochastic regulator 
is given by (15), which in the SISO case simplifies as in (17). In the latter case, 
the d-characteristic polynomial of the closed-loop system generically equals (18). 

The regulator (15) or (17) is also referred to as the Generalized Minimum-Variance 
regulator. Setting $\psi_u = O_{m \times m}$ or $\rho = 0$, the regulator (15) or (17) becomes the 
so-called Minimum-Variance regulator, which is to be understood as the regulator 
attempting to minimize the trace of the plant output covariance. 

As was to be expected in view of Sect. 2.7, the Single Step Stochastic regulator 
suffers (Cf. Problem 1) from the same limitations as in the deterministic case. In 
particular, for $\rho = 0$ it is inapplicable to SISO nonminimum-phase plants, and 
may not be capable of stabilizing nonminimum-phase open-loop unstable plants 
irrespective of $\rho$. As seen from (18), in the stochastic case $C(d)$, representing 
the dynamics of the implicit Kalman filter, is a factor of the resulting closed-loop 
characteristic polynomial and, hence, affects the robustness properties of the 
compensated system. 

Problem 7.3-2 (Single Step Stochastic Servo) Consider the CARMA plant (1)-(3), the 
performance index 

$$C = \mathcal{E}\{\|y(t + \tau) - r(t + \tau)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u} \mid y^t, r^{t+\tau}\}$$

and the admissible control strategy 

$$u(t) \in \sigma\{y^t, r^{t+\tau}\}$$

with $r$, the reference to be tracked, a wide-sense stationary process independent of the innovations 
$e$. The adopted control strategy amounts to assuming that the controller at time $t$ has full 
knowledge of the reference realization up to time $t + \tau$. Find how (15) must be modified to solve 
the above Single Step Stochastic servo problem, provided that the resulting control law yields an 
internally stable feedback system. 

Problem 7.3-3 (Minimum Variance Regulator) Consider the regulation law obtained from (17) 
for $\rho = 0$ and $\tau = 1$. Show that the resulting regulation law is the same as that obtained from 
(1) by forcing $y$ to equal $e$. 

7.3.2 Steady-State LQ Stochastic Linear Regulation 

Hereafter, instead of the conditional expectation (5), we consider the minimization 
in stochastic steady-state of a performance index consisting of the following 
unconditional expectation 

$$C = \mathcal{E}\{\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\} \tag{7.3-19}$$

According to Result 2-2, minimization of (19) is achieved by steady-state LQG 
regulation, whose Riccati-based solution is given by (2-17)-(2-19). The aim here is 
to solve the problem for CARMA plants via the polynomial equation approach, so 
as to obtain the stochastic extension of the results of Chapter 4. We shall proceed 
as follows. We tackle the problem by first finding the optimal linear regulator and, 
next, showing that this is also optimal among all nonlinear feedback compensators. 
This result does not require the plant to be Gaussian, being only sufficient that the 
innovations process e satisfy the martingale difference properties (3). 

Consider the plant (1)-(2) along an admissible regulation strategy which restricts 
the plant input to be given by a causal linear compensator with transfer matrix 
$\mathcal{K}(d)$ 

$$u(t) = -\mathcal{K}(d)y(t) \tag{7.3-20}$$

making the resulting feedback system internally stable. The strategy (20), besides 
linearity of the regulator, is also restrictive in that the plant output y to be regulated 
coincides with the variable at the compensator input. In this respect, more general 
system configurations have been considered in [CM91], [HSK91] and [HKS92] at 
the expense of greater algebraic complications. For the sake of simplicity, we shall 
refrain from discussing these extensions by restricting ourselves to the following 
problem. 



Steady State LQ Stochastic Linear (LQSL) Regulator Consider the 
CARMA plant (1)-(2) and the quadratic performance index (19). Find, 
whenever it exists, a linear feedback compensator (20) which makes the 
closed-loop system internally stable and minimizes (19). 



According to Theorem 3.2-2, the above stability requirement is equivalent to stating 
that $\mathcal{K}(d)$ is factorizable in terms of the ratio of two stable transfer matrices $M_2(d)$ 
and $N_2(d)$, or $M_1(d)$ and $N_1(d)$, 

$$\mathcal{K}(d) = N_2(d)M_2^{-1}(d) \tag{7.3-21a}$$
$$\phantom{\mathcal{K}(d)} = M_1^{-1}(d)N_1(d) \tag{7.3-21b}$$

satisfying the identities 

$$I_p = A_1(d)M_2(d) + B_1(d)N_2(d) \tag{7.3-22a}$$
$$I_m = M_1(d)A_2(d) + N_1(d)B_2(d) \tag{7.3-22b}$$

In order to minimize (19), it is first convenient to introduce some additional material 
on the description of wide-sense stationary processes and the transformations 
produced on their second order properties when they are filtered by time-invariant 
linear systems. Let $v$ be a vector-valued stochastic process with finite variance, 
i.e. $\mathcal{E}\{\|v(t)\|^2\} < \infty$ for all $t \in \mathbb{Z}$. Assuming $v$ to be wide-sense stationary, the 
two-sided sequence of its covariance matrices $K_v := \{K_v(k)\}_{k=-\infty}^{\infty}$ 

$$K_v(k) := \mathcal{E}\{v(t + k)v'(t)\} \tag{7.3-23a}$$
$$\phantom{K_v(k) :} = K_v'(-k) \tag{7.3-23b}$$

is called the covariance function of $v$. The last equality easily follows from wide-sense 
stationarity of $v$: 

$$\mathcal{E}\{v(t + k)v'(t)\} = \mathcal{E}\{v(t)v'(t - k)\} = \left[\mathcal{E}\{v(t - k)v'(t)\}\right]' .$$

The d-representation $\Psi_v(d)$ (Cf. Chapter 3) of $K_v$ 

$$\Psi_v(d) := \sum_{k=-\infty}^{\infty} K_v(k)d^k \tag{7.3-24a}$$
$$\phantom{\Psi_v(d) :} = \Psi_v'(d^{-1}) = \Psi_v^*(d) \tag{7.3-24b}$$

is called the spectral density function of $v$. Eq. (24b) shows that for $d$ taking values 
on the complex unit circle, $d = e^{i\theta}$, $\theta \in [0, 2\pi)$, $\Psi_v$ is Hermitian symmetric 

$$\Psi_v(e^{i\theta}) = \Psi_v'(e^{-i\theta}) \tag{7.3-24c}$$

Note that 

$$\mathcal{E}\{\|v(t)\|^2_Q\} = \mathrm{Tr}[QK_v(0)] = \mathrm{Tr}[Q\langle\Psi_v(d)\rangle] \tag{7.3-25}$$

where, as in (3.1-12), the symbol $\langle\,\cdot\,\rangle$ denotes extraction of the 0-power term. 
Eq. (25) is the counterpart of (3.1-12) in the present stochastic setting. 

Problem 7.3-4 Consider a linear time-invariant system with transfer matrix $H(d)$. Let its input 
$u$ and output $y$ be wide-sense stationary stochastic processes with spectral density functions $\Psi_u(d)$ 
and, respectively, $\Psi_y(d)$. Then, show that 

$$\Psi_y(d) = H(d)\Psi_u(d)H^*(d) \tag{7.3-26}$$
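
A scalar instance of (25) and (26) can be checked in a few lines. The sketch below assumes NumPy, a finite impulse response $H(d) = \sum_i h_i d^i$, and a white input with variance `psi_u`; the function names are illustrative. It forms the two-sided coefficient sequence of $H(d)\Psi_u H^*(d)$ and extracts its 0-power term, which by (25) is the output variance.

```python
import numpy as np

def spectral_density_coeffs(h, psi_u=1.0):
    """Two-sided coefficients of H(d) psi_u H*(d) for scalar H(d) = sum_i h[i] d^i.

    Entry k + len(h) - 1 of the result is the coefficient of d^k,
    i.e. the output covariance at lag k.
    """
    h = np.asarray(h, dtype=float)
    return psi_u * np.convolve(h, h[::-1])

def zero_power_term(two_sided):
    """Extract the d^0 coefficient of a symmetric two-sided sequence."""
    return two_sided[len(two_sided) // 2]

# Hypothetical example: H(d) = 1 + 0.5 d driven by white noise of variance 2.
psi_y = spectral_density_coeffs([1.0, 0.5], psi_u=2.0)
var_y = zero_power_term(psi_y)   # equals psi_u * sum_i h[i]^2
```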

By dealing with vector-valued wide-sense stationary processes, the notion of 
cross-covariance between two processes is already embedded in (23). In fact, let a process 
$z$ be partitioned into two separate processes $u$ and $v$, $z(t) = [\,u'(t) \;\; v'(t)\,]'$. Then, 
we have 

$$K_z(k) = \begin{bmatrix} K_u(k) & K_{uv}(k) \\ K_{vu}(k) & K_v(k) \end{bmatrix}$$

where 

$$K_{uv}(k) := \mathcal{E}\{u(t + k)v'(t)\} = K_{vu}'(-k)$$

is called the cross-covariance function of $u$ and $v$. Similarly to (24), we define the 
cross-spectral density function of $u$ and $v$ 

$$\Psi_{uv}(d) := \sum_{k=-\infty}^{\infty} K_{uv}(k)d^k = \Psi_{vu}'(d^{-1}) = \Psi_{vu}^*(d)$$

Problem 7.3-5 Consider two wide-sense stationary processes $u$ and $v$ with cross-spectral 
density $\Psi_{uv}(d)$. Let $y$ and $z$ be two other wide-sense stationary processes related to $u$ and $v$ as 
follows 

$$y(t) = H_{yu}(d)u(t) \qquad z(t) = H_{zv}(d)v(t)$$

where $H_{yu}(d)$ and $H_{zv}(d)$ are rational transfer matrices. Then show that 

$$\Psi_{yz}(d) = H_{yu}(d)\Psi_{uv}(d)H_{zv}^*(d)$$

Problem 7.3-6 Let $u$ and $v$ be wide-sense stationary processes with cross-spectral density 
$\Psi_{uv}(d)$. Show that 

$$\mathcal{E}\{v'(t)Qu(t)\} = \mathrm{Tr}[Q\langle\Psi_{uv}(d)\rangle]$$

Further, let $H(d)$ be a rational transfer function such that $z(t) = H(d)u(t)$, $\dim z(t) = \dim v(t)$, 
is wide-sense stationary. Show that 

$$\mathcal{E}\{v'(t)[H(d)u(t)]\} = \mathcal{E}\{[H^*(d)v(t)]'u(t)\} .$$



We next express $y(t)$ and $u(t)$ in terms of the exogenous input $e$ for the closed-loop 
system (1) and (20). 

Let $A_1^{-1}(d)B_1(d)$ and $B_2(d)A_2^{-1}(d)$ be irreducible left and, respectively, right 
MFDs of $A^{-1}(d)B(d)$ 

$$A^{-1}(d)B(d) = A_1^{-1}(d)B_1(d) \tag{7.3-27a}$$
$$\phantom{A^{-1}(d)B(d)} = B_2(d)A_2^{-1}(d) \tag{7.3-27b}$$

then, for the regulated system (1) and (20) we find 

$$\begin{aligned} y(t) &= -A_1^{-1}(d)B_1(d)N_2(d)M_2^{-1}(d)y(t) + A^{-1}(d)C(d)e(t) \\ &= \left[I_p + A_1^{-1}(d)B_1(d)N_2(d)M_2^{-1}(d)\right]^{-1} A^{-1}(d)C(d)e(t) \\ &= M_2(d)\left[A_1(d)M_2(d) + B_1(d)N_2(d)\right]^{-1} A_1(d)A^{-1}(d)C(d)e(t) \\ &= M_2(d)A_1(d)A^{-1}(d)C(d)e(t) && [(22a)] \\ &= \left[I_p - B_2(d)N_1(d)\right]A^{-1}(d)C(d)e(t) && [(3.2\text{-}23a)] \end{aligned} \tag{7.3-28}$$

Further 

$$\begin{aligned} u(t) &= -N_2(d)M_2^{-1}(d)y(t) \\ &= -N_2(d)A_1(d)A^{-1}(d)C(d)e(t) && [(28)] \\ &= -A_2(d)N_1(d)A^{-1}(d)C(d)e(t) && [(3.2\text{-}30a)] \end{aligned} \tag{7.3-29}$$

Using (25), (19) can be rewritten as follows. 

$$C = \mathrm{Tr}\langle \psi_y\Psi_y(d) + \psi_u\Psi_u(d)\rangle \tag{7.3-30}$$

where, by (28), (29) and (26), 

$$\Psi_y(d) = \left[I_p - B_2(d)N_1(d)\right] A^{-1}(d)D(d)D^*(d)A^{-*}(d) \left[I_p - N_1^*(d)B_2^*(d)\right]$$
$$\Psi_u(d) = A_2(d)N_1(d)A^{-1}(d)D(d)D^*(d)A^{-*}(d)N_1^*(d)A_2^*(d)$$

In the above equations $A^{-*}(d) := [A^{-1}(d)]^*$, and $D(d)$ is a $p \times p$ Hurwitz polynomial 
matrix such that 

$$D(d)D^*(d) = C(d)\Psi_e C^*(d) \tag{7.3-31}$$

Problem 7.3-7 Consider the plant 

$$A(d)y(t) = B(d)u(t) + L(d)\nu(t) \tag{7.3-32}$$

with $L(d)$ possibly non Hurwitz and rectangular, the cost (19), and the admissible regulation 
strategy 

$$u(t) = -\mathcal{K}(d)z(t) \tag{7.3-33}$$

where 

$$z(t) = y(t) + \zeta(t) \tag{7.3-34}$$

In the above equations $\nu$ and $\zeta$ are two mutually uncorrelated zero-mean wide-sense stationary 
white processes with $K_\nu(k) = \Psi_\nu \delta_{k,0}$ and $K_\zeta(k) = \Psi_\zeta \delta_{k,0}$. Show that the above steady-state 
LQ stochastic regulation problem is equivalent to the one for the CARMA plant 

$$A(d)z(t) = B(d)u(t) + D(d)e(t) \tag{7.3-35}$$

the admissible regulation strategy (33), and the cost 

$$C = \mathcal{E}\{\|z(t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u}\} \tag{7.3-36}$$

provided that $e$ is zero-mean wide-sense stationary and white with identity covariance matrix, 
and $D(d)$ is a $p \times p$ Hurwitz polynomial matrix solving the following left spectral factorization 
problem 

$$D(d)D^*(d) = L(d)\Psi_\nu L^*(d) + A(d)\Psi_\zeta A^*(d) \tag{7.3-37}$$

$D(d)$ exists if and only if 

$$\mathrm{rank}\left[\; L(d)\Psi_\nu \quad A(d)\Psi_\zeta \;\right] = p = \dim z \tag{7.3-38}$$



Using the expressions for $\Psi_y(d)$ and $\Psi_u(d)$, we find 

$$C = \mathrm{Tr}\left\langle D^*(d)A^{-*}(d)\left[N_1^*(d)E^*(d)E(d)N_1(d) - \psi_y B_2(d)N_1(d) - N_1^*(d)B_2^*(d)\psi_y + \psi_y\right] A^{-1}(d)D(d)\right\rangle$$

where $E(d)$ is an $m \times m$ Hurwitz polynomial matrix solving the right spectral 
factorization problem (Cf. (4.1-12)) 

$$E^*(d)E(d) = A_2^*(d)\psi_u A_2(d) + B_2^*(d)\psi_y B_2(d) \tag{7.3-39}$$

$E(d)$ exists if and only if 

$$\mathrm{rank}\begin{bmatrix} \psi_u A_2(d) \\ \psi_y B_2(d) \end{bmatrix} = m = \dim u \tag{7.3-40}$$

$C$ can be reorganized as follows 

$$C = C_1 + C_2 \tag{7.3-41a}$$

with 

$$C_1 := \mathrm{Tr}\langle \mathcal{L}^*(d)\mathcal{L}(d)\rangle \tag{7.3-41b}$$

$$\mathcal{L}(d) := \left[E^{-*}(d)B_2^*(d)\psi_y - E(d)N_1(d)\right] A^{-1}(d)D(d) \tag{7.3-41c}$$

$$C_2 := \mathrm{Tr}\left\langle D^*(d)A^{-*}(d)\left[\psi_y - \psi_y B_2(d)E^{-1}(d)E^{-*}(d)B_2^*(d)\psi_y\right] A^{-1}(d)D(d)\right\rangle \tag{7.3-41d}$$

Note that $C_2$ is not affected by the choice of $\mathcal{K}(d)$. Thus, the problem amounts to 
finding $N_1(d)$ minimizing (41b). This equals the square of the $\ell_2$-norm (Cf. (3.1-12)) 
of the matrix sequence $\mathcal{L}(d)$. In turn, $\mathcal{L}(d)$ has two additive components. One, 
$E(d)N_1(d)A^{-1}(d)D(d)$, is causal. The other, which results from premultiplying the 
causal sequence $\psi_y A^{-1}(d)D(d)$ by $E^{-*}(d)B_2^*(d)$, is a two-sided matrix sequence. 
The situation is similar to the one met in the deterministic LQ regulation problem: 
Cf. (4.1-20)-(4.1-23). We then follow the same solution method as in Sect. 4.2. 

Causal Anticausal Decomposition Let 

$$q := \max\{\partial A_2(d), \partial B_2(d), \partial E(d)\} \tag{7.3-42a}$$

$$\bar{A}_2(d) := d^q A_2^*(d); \quad \bar{B}_2(d) := d^q B_2^*(d); \quad \bar{E}(d) := d^q E^*(d) \tag{7.3-42b}$$

Then, (39) can be rewritten as 

$$\bar{E}(d)E(d) = \bar{A}_2(d)\psi_u A_2(d) + \bar{B}_2(d)\psi_y B_2(d) \tag{7.3-43}$$

Likewise, the first additive term on the R.H.S. of (41c) can be rewritten as 

$$\tilde{\mathcal{L}}(d) := \bar{E}^{-1}(d)\bar{B}_2(d)\psi_y A^{-1}(d)D(d) \tag{7.3-44}$$

Suppose now that we can find a pair of polynomial matrices $Y(d)$ and $Z(d)$ fulfilling 
the following bilateral Diophantine equation 

$$\bar{E}(d)Y(d) + Z(d)A_3(d) = \bar{B}_2(d)\psi_y D_2(d) \tag{7.3-45a}$$

with the degree constraint 

$$\partial Z(d) < \partial \bar{E}(d) = q \tag{7.3-45b}$$

The last equality follows from the fact that, $E(d)$ being Hurwitz, $E(0)$ is nonsingular. 
In (45a) we have denoted by $A_3(d)D_2^{-1}(d)$ an irreducible right MFD of $D^{-1}(d)A(d)$ 

$$D^{-1}(d)A(d) = A_3(d)D_2^{-1}(d) \tag{7.3-46}$$



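Using (45a) in (44) we find 

$$\tilde{\mathcal{L}}(d) = \tilde{\mathcal{L}}_+(d) + \mathcal{L}_-(d)$$

where 

$$\tilde{\mathcal{L}}_+(d) := Y(d)A_3^{-1}(d), \qquad \mathcal{L}_-(d) := \bar{E}^{-1}(d)Z(d) \tag{7.3-47}$$

are, respectively, a causal and a strictly anticausal and possibly $\ell_2$ sequence (Cf. (4.2-10)). 
In conclusion, provided that we can find a pair $(Y(d), Z(d))$ solving (45), $\mathcal{L}(d)$ 
can be decomposed in terms of a causal sequence $\mathcal{L}_+(d)$ and a strictly anticausal 
sequence $\mathcal{L}_-(d)$ as follows 

$$\mathcal{L}(d) = \mathcal{L}_+(d) + \mathcal{L}_-(d) \tag{7.3-48}$$

$$\mathcal{L}_+(d) := Y(d)A_3^{-1}(d) - E(d)N_1(d)A^{-1}(d)D(d) \tag{7.3-49}$$

Hence, by the same argument as in (4.1-24)-(4.1-28), we have 

$$\begin{aligned} C_1 &= \mathrm{Tr}\left\langle \left[\mathcal{L}_+^*(d) + \mathcal{L}_-^*(d)\right]\left[\mathcal{L}_+(d) + \mathcal{L}_-(d)\right]\right\rangle \\ &= \mathrm{Tr}\langle \mathcal{L}_+^*(d)\mathcal{L}_+(d)\rangle + C_3 \end{aligned} \tag{7.3-50}$$

where 

$$C_3 := \mathrm{Tr}\langle \mathcal{L}_-^*(d)\mathcal{L}_-(d)\rangle \tag{7.3-51}$$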



With $C_2$ as in (41d), assume that $C_2 + C_3$ is bounded. Then an optimal $N_1(d)$ is 
obtained by setting $\mathcal{L}_+(d) = O_{m \times p}$, i.e. 

$$\begin{aligned} N_1(d) &= E^{-1}(d)Y(d)A_3^{-1}(d)D^{-1}(d)A(d) \\ &= E^{-1}(d)Y(d)D_2^{-1}(d) && [(46)] \end{aligned} \tag{7.3-52}$$

Stability The remaining part of the transfer matrix of the optimal regulator, viz. 
the stable transfer matrix $M_1(d)$, can be found via (22b): 

$$\begin{aligned} M_1(d)A_2(d) &= I_m - N_1(d)B_2(d) \\ &= I_m - E^{-1}(d)Y(d)D_2^{-1}(d)B_2(d) \end{aligned}$$

The problem here is that the Diophantine equation (45), even if solvable, need not 
have a unique solution, so that $N_1(d)$ need not be unique. In such a case, some solutions may not yield stable 
transfer matrices $M_1(d)$ via the above equation. The situation is similar to the one 
faced for the deterministic LQR problem as discussed in Sect. 4.3. By imposing 
stability of the closed-loop system, we obtain, under general conditions, uniqueness 
of the solution. To this end, we write 

$$\begin{aligned} M_1(d) &= E^{-1}(d)\bar{E}^{-1}(d)\left[\bar{E}(d)E(d) - \bar{E}(d)Y(d)D_2^{-1}(d)B_2(d)\right]A_2^{-1}(d) \\ &= E^{-1}(d)\bar{E}^{-1}(d)\left[\bar{E}(d)E(d) + Z(d)A_3(d)D_2^{-1}(d)B_2(d) - \bar{B}_2(d)\psi_y B_2(d)\right]A_2^{-1}(d) && [(45a)] \\ &= E^{-1}(d)\bar{E}^{-1}(d)\left[\bar{A}_2(d)\psi_u + Z(d)B_3(d)D_1^{-1}(d)\right] && [(43), (46), (53)] \end{aligned}$$

To get the last equality we have set 

$$B_3(d)D_1^{-1}(d) := D^{-1}(d)B(d) \tag{7.3-53}$$

with $B_3(d)$ and $D_1(d)$ right coprime polynomial matrices. Hence, 

$$M_1(d)D_1(d) = E^{-1}(d)\bar{E}^{-1}(d)\left[\bar{A}_2(d)\psi_u D_1(d) + Z(d)B_3(d)\right] \tag{7.3-54}$$



Recall that by (31) or (37) $D_1(d)$ is Hurwitz, and $\bar{E}(d)$, $E(d)$ being Hurwitz, is 
anti-Hurwitz. Then, a necessary condition for stability of $M_1(d)$ is that the polynomial 
matrix within brackets in (54) be divided on the left by $\bar{E}(d)$. I.e., there must be 
a polynomial matrix $X(d)$ satisfying the following equation 

$$\bar{E}(d)X(d) - Z(d)B_3(d) = \bar{A}_2(d)\psi_u D_1(d) \tag{7.3-55}$$



By the same argument used after (4.3-7), it follows that $X(d)$ is nonsingular. Recalling 
(1), (21), (52), (54) and (55), we conclude that in order to solve the steady-state 
LQSL regulation problem, in addition to the spectral factorization problems (31) 
or (37) and (39), we have to find a solution $(X(d), Y(d), Z(d))$ with $\partial Z(d) < \partial \bar{E}(d)$ 
of the two bilateral Diophantine equations (45) and (55). Using (55) in (54), we 
find 

$$M_1(d) = E^{-1}(d)X(d)D_1^{-1}(d) \tag{7.3-56}$$

This, together with (21) and (52), yields 

$$\mathcal{K}(d) = D_1(d)X^{-1}(d)Y(d)D_2^{-1}(d) \tag{7.3-57}$$

We then see that $Z(d)$ in (45) and (55) plays the role of a "dummy" polynomial 
matrix. By eliminating $Z(d)$ in (45) and (55), we get 

$$X(d)D_1^{-1}(d)A_2(d) + Y(d)D_2^{-1}(d)B_2(d) = E(d) \tag{7.3-58}$$

Then, in turn, setting 

$$\begin{bmatrix} D_1^{-1}(d)A_2(d) \\ D_2^{-1}(d)B_2(d) \end{bmatrix} = \begin{bmatrix} A_4(d) \\ B_4(d) \end{bmatrix} D_3^{-1}(d) \tag{7.3-59}$$

with the expression on the R.H.S. an irreducible right MFD, (58) can be rewritten as 

$$X(d)A_4(d) + Y(d)B_4(d) = E(d)D_3(d) \tag{7.3-60}$$

It is instructive to compare (60) with (2-22) and conclude that, in the absence of 
possible cancellations, the d-characteristic polynomial of the steady-state LQSL 
regulated system is proportional to $\det E(d) \cdot \det C(d)$. 

Problem 7.3-8 Show that a triplet (X(d), Y(d), Z(d)) is a solution of (45) and (55) if and only 
if it solves (45) and (58). [Hint: Prove sufficiency by using (43).] 



Finally, the closed-loop feedback system (1) [(32)], (20) [(33)], (21) and (22) 
is internally stable if and only if (52) and (56) are stable transfer matrices. The 
following lemma sums up the above results. 

Lemma 7.3-1. Provided that: 

i. Eq. (45a) and (55) [or (45a) and (60)] admit a solution $(X(d), Y(d), Z(d))$ 
with $\partial Z(d) < \partial \bar{E}(d)$; and 

ii. the transfer matrices (52) and (56) are both stable, 

the steady-state LQSL regulator is given by the dynamic feedback compensator 

$$u(t) = -D_1(d)X^{-1}(d)Y(d)D_2^{-1}(d)y(t) \tag{7.3-61}$$

The minimum cost achievable with the optimal regulation law (61) equals 

$$C_{\min} = C_2 + C_3 \tag{7.3-62}$$
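
To make the pair of bilateral equations (45a) and (55) concrete, the following Python sketch solves them for a scalar plant by stacking both equations into one linear system in the coefficients of $X$, $Y$ and $Z$. It rests on assumptions that are not part of the text: SISO plant, $\Psi_e = 1$ so that $D(d) = C(d)$, pairwise coprime $A$, $B$, $C$ (hence $A_2 = A_3 = A$, $B_2 = B_3 = B$, $D_1 = D_2 = C$), a spectral factor $E(d)$ supplied by the caller, and generously chosen degree bounds for $X$ and $Y$. The names `conv_matrix` and `lqsl_siso` are hypothetical, and coefficients are in ascending powers of $d$.

```python
import numpy as np

def conv_matrix(p, n_cols, n_rows):
    """Matrix T such that T @ x holds the coefficients of p(d) x(d)."""
    T = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        T[j:j + len(p), j] = p
    return T

def lqsl_siso(a, b, c, e, psi_y, psi_u):
    """Solve (45a) and (55) for scalar X, Y, Z with deg Z < q; returns (x, y, z, residual)."""
    a, b, c, e = (np.asarray(v, dtype=float) for v in (a, b, c, e))
    q = max(len(a), len(b), len(e)) - 1

    def bar(p):                                   # coefficients of d^q p(1/d), cf. (42b)
        return np.pad(p, (0, q + 1 - len(p)))[::-1]

    a_bar, b_bar, e_bar = bar(a), bar(b), bar(e)
    nx = max(len(c) - 1, len(b) - 2) + 1          # number of coefficients allowed for X
    ny = max(len(c) - 1, len(a) - 2) + 1          # number of coefficients allowed for Y
    nz = q                                        # deg Z < q
    rows1, rows2 = q + ny, q + nx

    rhs1 = np.zeros(rows1)                        # B2bar psi_y D2, right side of (45a)
    t = psi_y * np.convolve(b_bar, c)
    rhs1[:len(t)] = t
    rhs2 = np.zeros(rows2)                        # A2bar psi_u D1, right side of (55)
    t = psi_u * np.convolve(a_bar, c)
    rhs2[:len(t)] = t

    M = np.block([
        [np.zeros((rows1, nx)), conv_matrix(e_bar, ny, rows1), conv_matrix(a, nz, rows1)],
        [conv_matrix(e_bar, nx, rows2), np.zeros((rows2, ny)), -conv_matrix(b, nz, rows2)],
    ])
    rhs = np.concatenate([rhs1, rhs2])
    sol = np.linalg.lstsq(M, rhs, rcond=None)[0]
    residual = np.linalg.norm(M @ sol - rhs)
    return sol[:nx], sol[nx:nx + ny], sol[nx + ny:], residual

# Hypothetical CAR example: A = 1 - 0.9 d, B = d, C = 1, psi_y = 1, psi_u = rho = 1.
# Here E*(d)E(d) = 2.81 - 0.9 (d + 1/d), whose Hurwitz factor is E(d) = e0 + e1 d.
rho = 1.0
e0 = 0.5 * (np.sqrt(4.61) + np.sqrt(1.01))
e1 = 0.5 * (np.sqrt(1.01) - np.sqrt(4.61))
x, y, z, res = lqsl_siso([1.0, -0.9], [0.0, 1.0], [1.0], [e0, e1], 1.0, rho)
gain = y[0] / x[0]                                # regulator u(t) = -gain * y(t)
```

A least-squares solve is used only because the stacked system has more equations than unknowns; a residual close to zero indicates that the equations are consistent. In this first-order example the resulting gain can be compared with the one obtained from the scalar Riccati equation of the same LQ problem.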



Solvability It remains to establish conditions under which (45) and (55) are 
solvable. 

Problem 7.3-9 Modify the proof of Lemma 4.4-1 to show that if, according to assumption 
(2c), the greatest common left divisors of A(d) and B(d) are strictly Hurwitz, there is a unique 
solution (X(d),Y(d),Z(d)) of (45) and (55) [or (45) and (60)]. 

In [HSG87] the use of the single unilateral Diophantine equation (60), instead of 
both (45) and (55) or (60), was discussed and shown to be possible provided that 
A(d) and B(d) are left coprime. This is the counterpart in the present stochastic 
setting of the result in Lemma 4.4-2. 

The following theorem immediately follows from Lemma 1 and Problem 9. 

Theorem 7.3-3. (Steady-state LQSL regulator) Consider the CARMA plant 
(1) subject to the conditions (2). Let (40) be fulfilled. Then, (45) and (55) [or (45) 
and (60)] admit a unique solution $(X(d), Y(d), Z(d))$ of minimum degree w.r.t. 
$Z(d)$. Further, the linear time-invariant feedback compensator (61) yields an 
internally stable closed-loop system if and only if (52) and (56) are both stable transfer 
matrices. In such a case the steady-state LQSL regulator is given by (61) and yields 
the minimum cost 

$$C_{\min} = \min_{\mathcal{K}(d)} \mathrm{Tr}\left[\psi_y\Psi_y + \psi_u\Psi_u\right] = C_2 + C_3 \tag{7.3-63}$$

$$\Psi_y := \mathrm{Cov}\,(y(t)) \qquad \Psi_u := \mathrm{Cov}\,(u(t))$$

Remark 7.3-3 In Theorem 3 conditions for the existence of the steady-state 
LQSL regulator are implicit, and they can only be checked after computing the 
transfer matrices (52) and (56). Explicit sufficient conditions for solvability and 
uniqueness of the steady-state LQSL regulator, in addition to (2) and (3), are the 
following: 

i. $D(d)$ is strictly Hurwitz; and 

ii. $E(d)$ is strictly Hurwitz 

In the steady-state LQG regulation problem $\Psi_\zeta > 0$ is a stronger analogue of i) 
(Cf. also (37)), while ii) is fulfilled in case $\psi_y > 0$ and $\psi_u > 0$ (Cf. Problem 10 
below). Taking into account the difference in the plant models that are adopted in 
the two cases, we can thus conclude that solvability and uniqueness conditions for 
both steady-state LQSL regulation and steady-state LQG regulation are basically 
the same. □ 

Problem 7.3-10 Consider the right spectral factorization problem (39). Show that $E(d)$ is 
strictly Hurwitz if $\psi_u > 0$ and $\psi_y > 0$. [Hint: Prove that $E'(e^{-i\theta})E(e^{i\theta}) > 0$, $\theta \in [0, 2\pi)$.] 
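
In the scalar case the right spectral factorization (39) can be carried out by root selection. The Python sketch below is one possible way to do it, under assumptions that should be kept in mind: scalar $A_2(d)$ and $B_2(d)$ with real coefficients in ascending powers of $d$, positive weights, no roots of $E^*(d)E(d)$ on the unit circle, and the convention that a strictly Hurwitz polynomial in $d$ has all its roots outside the closed unit disc. The function name `spectral_factor` is illustrative.

```python
import numpy as np

def spectral_factor(a2, b2, psi_y, psi_u):
    """Scalar E(d), roots outside the unit circle, with E*E = A2* psi_u A2 + B2* psi_y B2.

    Returns (e, s): e holds the coefficients of E(d) in ascending powers of d,
    s the two-sided coefficients of E*(d)E(d), with the d^0 term in the middle.
    """
    a2 = np.asarray(a2, dtype=float)
    b2 = np.asarray(b2, dtype=float)
    n = max(len(a2), len(b2))
    a = np.pad(a2, (0, n - len(a2)))
    b = np.pad(b2, (0, n - len(b2)))
    s = psi_u * np.convolve(a, a[::-1]) + psi_y * np.convolve(b, b[::-1])
    roots = np.roots(np.trim_zeros(s))            # roots come in pairs (r, 1/r)
    outside = roots[np.abs(roots) > 1.0]          # keep the ones outside the unit circle
    p = np.real(np.poly(1.0 / outside))           # prod_i (1 - d / r_i), ascending in d
    e = np.sqrt(s[n - 1] / np.sum(p ** 2)) * p    # match the d^0 coefficient of s
    return e, s

# Hypothetical example: A2 = 1 - 0.5 d, B2 = d, psi_y = 1, psi_u = 0.5.
e, s = spectral_factor([1.0, -0.5], [0.0, 1.0], psi_y=1.0, psi_u=0.5)
```

Root selection is chosen here only for brevity; it becomes ill-conditioned for high degrees or for roots close to the unit circle, where iterative factorization schemes are usually preferred.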

Problem 7.3-11 Find the polynomial equations giving the steady-state LQSL regulator for 
the CAR plant $A(d)y(t) = B(d)u(t) + e(t)$. Compare these equations with the ones of Chapter 4 
solving the problem in the deterministic case. 

In applications it is often important to consider, instead of (19), a performance 
index involving filtered versions of $y$ and $u$ 

$$\begin{aligned} C &= \mathcal{E}\{\|W_y(d)y(k)\|^2 + \|W_u(d)u(k)\|^2\} \\ &= \mathrm{Tr}\langle W_y(d)\Psi_y(d)W_y^*(d) + W_u(d)\Psi_u(d)W_u^*(d)\rangle \end{aligned} \tag{7.3-64}$$

In (64) $W_y(d)$ and $W_u(d)$ denote two stable transfer matrices that we represent by 
irreducible right MFDs 

$$W_y(d) = B_y(d)A_y^{-1}(d) \qquad W_u(d) = B_u(d)A_u^{-1}(d) \tag{7.3-65}$$

Problem 7.3-12 (Cost with polynomial weights) Consider the cost (64) with $W_y(d) = B_y(d)$ 
and $W_u(d) = B_u(d)$. Show that the polynomial equations giving the related steady-state LQSL 
regulator are the same as the ones obtained for the simpler cost (19), once the following changes 
are adopted 

$$\psi_y \mapsto B_y^*(d)B_y(d) \qquad \psi_u \mapsto B_u^*(d)B_u(d)$$

Namely, 

$$E^*(d)E(d) = A_2^*(d)B_u^*(d)B_u(d)A_2(d) + B_2^*(d)B_y^*(d)B_y(d)B_2(d) \tag{7.3-66a}$$

$$\bar{E}(d)E(d) = \bar{A}_2(d)\bar{B}_u(d)B_u(d)A_2(d) + \bar{B}_2(d)\bar{B}_y(d)B_y(d)B_2(d) \tag{7.3-66b}$$

where $\bar{E}(d) := d^q E^*(d)$, $\bar{A}_2(d)\bar{B}_u(d) := d^q A_2^*(d)B_u^*(d)$, $\bar{B}_2(d)\bar{B}_y(d) := d^q B_2^*(d)B_y^*(d)$, 
$q := \max\{\partial B_u(d) + \partial A_2(d), \partial B_y(d) + \partial B_2(d), \partial E(d)\}$ 

$$\bar{E}(d)Y(d) + Z(d)A_3(d) = \bar{B}_2(d)\bar{B}_y(d)B_y(d)D_2(d) \tag{7.3-67}$$

$$\bar{E}(d)X(d) - Z(d)B_3(d) = \bar{A}_2(d)\bar{B}_u(d)B_u(d)D_1(d) \tag{7.3-68}$$

with the optimal regulation law given again by (61). 

Problem 7.3-13 Show that the steady-state LQSL regulator problem for the SISO CARMA 
plant (1) and the cost (19) can be equivalently reformulated for the CAR plant 

$$A(d)y_c(t) = B(d)u_c(t) + e(t)$$

$$C(d)y_c(t) = y(t) \qquad C(d)u_c(t) = u(t)$$

and the cost 

$$\mathcal{E}\{\psi_y[C(d)y_c(t)]^2 + \psi_u[C(d)u_c(t)]^2\}$$

Then find the steady-state LQSL regulator by using the results of Problems 11 and 12. 



Problem 7.3-14 (Cost with dynamic weights) Consider the cost (64) with $W_y(d)$ and $W_u(d)$ 
as in (65). Exploit the solution of Problem 12 to show that the polynomial equations giving 
the related steady-state LQSL regulator are again as in (66)-(68) provided that the following 
definitions are adopted: 

$$[A(d)A_y(d)]^{-1}B(d)A_u(d) = B_2(d)A_2^{-1}(d) \tag{7.3-69}$$

$$D^{-1}(d)A(d)A_y(d) = A_3(d)D_2^{-1}(d) \qquad D^{-1}(d)B(d)A_u(d) = B_3(d)D_1^{-1}(d) \tag{7.3-70}$$

with $A_2(d)$ and $B_2(d)$, $D_2(d)$ and $A_3(d)$, and $D_1(d)$ and $B_3(d)$ all right coprime. 
Finally the optimal regulation law is as follows 

$$u(t) = -A_u(d)D_1(d)X^{-1}(d)Y(d)D_2^{-1}(d)A_y^{-1}(d)y(t) \tag{7.3-71}$$

[Hint: Define new variables $\gamma(t) := A_y^{-1}(d)y(t)$ and $\nu(t) := A_u^{-1}(d)u(t)$.] 



Problem 7.3-15 (Stabilizing Minimum-Variance Regulation) Consider the solution of the 
steady-state LQSL regulation problem when the control variable is not costed, viz. $\psi_u = O_{m \times m}$. 
The resulting regulation law will be referred to as Stabilizing Minimum-Variance regulation 
since the polynomial solution ensures internal stability if $D(d)$ and $E(d)$ are strictly Hurwitz. 
Find the Stabilizing Minimum-Variance regulation law for two SISO CARMA plants, one 
minimum-phase and the other nonminimum-phase. Compare these results with those pertaining 
to Minimum-Variance regulation, and contrast the two regulation laws. 



Problem 7.3-16 (LQSL regulator information pattern) Consider the LQSL regulator for a 
SISO CARMA plant with $A(d)$, $B(d)$, $C(d)$ having unit gcd and $\mathrm{ord}\,B(d) = 1 + \ell$. Show that the 
LQSL regulation law $X(d)u(t) = -Y(d)y(t)$ is such that 

$$\partial X(d) = \begin{cases} \partial B(d) - 1, & \psi_u = 0 \\ \max\{\partial B(d) - 1, \partial C(d)\}, & \psi_u > 0 \end{cases}$$

$$\partial Y(d) = \max\{\partial A(d) - 1, \partial C(d) - 1 - \ell\}$$



7.3.3 LQSL Regulator Optimality among Nonlinear Compensators 

Suppose that, instead of just assuming the innovations process e in the CARMA 
plant (1) white, we adopt for e the martingale difference properties (3). Then, as 
shown hereafter, the steady-state LQSL regulator of Theorem 3 turns out to be 
also optimal among all possibly nonlinear compensators. 

Consider the CARMA plant (1)-(3), the cost (19) and the admissible regulation 
strategy (4). We wish to consider the following problem. 

Steady-State LQ Stochastic (LQS) Regulator Consider the CARMA 
plant (1)-(2), the cost (19) and the admissible regulation strategy (4). Assume 
that the innovations $e$ have the martingale difference properties (3). 
Among all possibly nonlinear regulation strategies, find, whenever they exist, 
the ones which make the closed-loop system internally stable and minimize 
(19). 

We point out that, if the steady-state LQSL regulator exists, the admissible 
regulation strategy (4) can be written as 

$$\begin{aligned} u(t) &= -M_1^{-1}(d)N_1(d)y(t) + v(t) \\ &= -N_2(d)M_2^{-1}(d)y(t) + v(t) \end{aligned} \tag{7.3-72}$$

where: $N_1(d)$ and $M_1(d)$ are the stable transfer matrices in (52) and, respectively, 
(56) whose ratio $\mathcal{K}(d) = M_1^{-1}(d)N_1(d)$ defines the steady-state LQSL regulator 
transfer matrix; $N_2(d)$ and $M_2(d)$ are such that $\mathcal{K}(d) = N_2(d)M_2^{-1}(d)$ and satisfy 
(22a); and $v(t)$ is any wide-sense stationary process such that 

$$v(t) \in \sigma\{y^t\} = \sigma\{e^t\} \tag{7.3-73}$$

Thus, the above stated steady-state LQS regulation problem amounts to finding a 
process $v$ as in (73) such that the corresponding regulation law (72) stabilizes the 
plant and minimizes (19). 

Under the feedback control law (72), $y$ and $u$ can be expressed in terms of $e$ 
and $v$ as follows. 

$$y(t) = y_e(t) + y_v(t) \tag{7.3-74a}$$
$$y_e(t) := M_2(d)A_1(d)A^{-1}(d)C(d)e(t) \tag{7.3-74b}$$
$$y_v(t) := B_2(d)M_1(d)v(t) \tag{7.3-74c}$$

$$u(t) = u_e(t) + u_v(t) \tag{7.3-75a}$$
$$u_e(t) := -A_2(d)N_1(d)A^{-1}(d)C(d)e(t) \tag{7.3-75b}$$
$$u_v(t) := A_2(d)M_1(d)v(t) \tag{7.3-75c}$$

Problem 7.3-17 By using (22) and (3.2-32b), verify (74) and (75). 

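With reference to the above decompositions, the cost (19) can be split as follows 

$$C = C_e + C_{ev} + C_v$$

where 

$$C_e := \mathcal{E}\{\|y_e(t)\|^2_{\psi_y} + \|u_e(t)\|^2_{\psi_u}\}$$

$$C_{ev} := 2\,\mathcal{E}\{y_e'(t)\psi_y y_v(t) + u_e'(t)\psi_u u_v(t)\} \tag{7.3-76}$$

$$\begin{aligned} C_v &:= \mathcal{E}\{\|y_v(t)\|^2_{\psi_y} + \|u_v(t)\|^2_{\psi_u}\} \\ &= \mathcal{E}\{\|E(d)M_1(d)v(t)\|^2\} \end{aligned} \tag{7.3-77}$$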

Problem 7.3-18 Using (25) and the results of Problem 5, show that 

$$C_{ev} = 2\,\mathrm{Tr}\left\langle \Psi_{ev}(d)\left[H_{yv}^*(d)\psi_y H_{ye}(d) + H_{uv}^*(d)\psi_u H_{ue}(d)\right]\right\rangle$$

$$H_{yv}(d) := B_2(d)M_1(d) \qquad H_{ye}(d) := M_2(d)A_1(d)A^{-1}(d)C(d)$$

$$H_{uv}(d) := A_2(d)M_1(d) \qquad H_{ue}(d) := -A_2(d)N_1(d)A^{-1}(d)C(d)$$

where $\Psi_{ev}(d)$ is defined as follows. Let 

$$K_{ev}(k) := \mathcal{E}\{e(t + k)v'(t)\} = K_{ve}'(-k) \tag{7.3-78}$$

Then 

$$\Psi_{ev}(d) := \sum_{k=-\infty}^{\infty} K_{ev}(k)d^k = \Psi_{ve}'(d^{-1}) = \Psi_{ve}^*(d) \tag{7.3-79}$$

Problem 7.3-19 Using the results of Problem 18, conclude that 

$$C_{ev} = 2\,\mathrm{Tr}\langle \bar{Z}(d)M_1(d)\Psi_{ve}(d)\rangle \tag{7.3-80}$$

$$\bar{Z}(d) := d^q Z^*(d)$$

where $q$ and $Z(d)$ are as in (42) and, respectively, (45). 

We next show that, from the martingale difference properties (3) of the innovations 
process $e$, it follows that $C_{ev} = 0$. In fact, by the smoothing properties of conditional 
expectations (Cf. Appendix D.3) 

$$\begin{aligned} K_{ev}(k) &= \mathcal{E}\{\mathcal{E}\{e(t + k) \mid e^t\}v'(t)\} \\ &= O_{p \times m}, \quad \forall k \ge 1 && [(3a)] \end{aligned}$$

Thus, $K_{ev}(\cdot)$ [$K_{ve}(\cdot)$] is an anticausal [causal] matrix sequence. Since $M_1(d)$ is 
causal and, by (45b), $\bar{Z}(d)$ strictly causal, it follows that (Cf. Sect. 4.2) (80) vanishes. 

Lemma 7.3-2. Consider the cost C ev (76) where the involved processes are as in 
(74) and (75). Let v satisfy (73) and e the martingale difference properties (3). 
Then C ev = 0. 

It then follows that the optimal process $v$ to be used in (73) equals $O_m$ a.e. We 
have thus established the desired result. 

Theorem 7.3-4 (Steady-state LQS regulator). Whenever it exists, the steady-state 
LQSL regulator of Theorem 3 is also optimal among all possibly nonlinear regulation 
strategies (4), provided that the innovations process $e$ enjoys the martingale 
difference properties (3). 

We finally point out that if the CARMA plant (1)-(3) is the innovations 
representation of the "physical" plant (2-14), in order that (3) hold, it is essentially 
required that the processes in the plant (2-14) be jointly Gaussian. 

Main points of the section The polynomial equation approach can be used to 
solve steady-state LQ stochastic regulation problems. The Single Step Stochastic 
regulator does not involve any spectral factorization problem and can be computed 
by solving a single Diophantine equation. As with its deterministic counterpart, 
applicability of Single Step Stochastic regulation is limited by the fact that it yields 
an internally stable feedback system only under restrictive assumptions on the plant 
and the cost weights. The polynomial solution for the steady-state LQ Stochastic 
regulator of CARMA plants can be obtained along the same lines as the one fol- 
lowed for the deterministic steady-state LQOR problem of Chapter 4. The case of 
dynamic weights in the cost can be accommodated in a straightforward way within 
the equations solving the standard case of constant weights. 

7.4 Monotonic Performance Properties of LQ Stochastic Regulation 

We now discuss some monotonicity properties of steady-state LQ stochastic 
regulation. As will be seen in due time, these properties are important for 
establishing local convergence results of adaptive LQ regulators with mean square 
input constraints. 
We consider a performance index parameterized by a positive real input weight 
$\rho$, $\rho > 0$, 

$$\begin{aligned} C &= \mathcal{E}\{\|y(k)\|^2 + \rho\|u(k)\|^2\} \\ &= \mathrm{Tr}\left[\Psi_y + \rho\Psi_u\right] \end{aligned} \tag{7.4-1}$$

where $\Psi_y = \langle\Psi_y(d)\rangle$ and $\Psi_u = \langle\Psi_u(d)\rangle$ are the covariance matrices of the 
wide-sense stationary processes $y$ and, respectively, $u$. More precisely, $\Psi_y = \Psi_y(\rho)$ and 
$\Psi_u = \Psi_u(\rho)$ are the covariance matrices of $y$ and $u$ in stochastic steady-state 
when the plant is regulated by the steady-state LQ stochastic regulator optimal 
for the given $\rho$. We then show that $\mathrm{Tr}[\Psi_u(\rho)]$ ($\mathrm{Tr}[\Psi_y(\rho)]$) is a strictly decreasing 
(increasing) function of $\rho$. To see this, consider two different values of $\rho$, $\rho_i$, $i = 1, 2$. 
Let $\Psi_y^i$, $\Psi_u^i$ denote the stochastic steady-state covariance matrices pertaining to 
the steady-state LQ stochastic regulator minimizing (1) for $\rho = \rho_i$. Recall that 
under suitable assumptions (Cf. Sect. 2 and 3) for every $\rho > 0$ there exists a unique 
optimal steady-state LQ stochastic regulator. Then, the following inequalities hold 

$$\mathrm{Tr}\left[\Psi_y^1 + \rho_1\Psi_u^1\right] < \mathrm{Tr}\left[\Psi_y^2 + \rho_1\Psi_u^2\right]$$
$$\mathrm{Tr}\left[\Psi_y^2 + \rho_2\Psi_u^2\right] < \mathrm{Tr}\left[\Psi_y^1 + \rho_2\Psi_u^1\right]$$

or, equivalently, 

$$\rho_1\,\mathrm{Tr}\left[\Psi_u^1 - \Psi_u^2\right] < \mathrm{Tr}\left[\Psi_y^2 - \Psi_y^1\right] < \rho_2\,\mathrm{Tr}\left[\Psi_u^1 - \Psi_u^2\right]$$

Hence 

$$\rho_2 > \rho_1 > 0 \implies \begin{cases} \mathrm{Tr}[\Psi_u(\rho_2)] < \mathrm{Tr}[\Psi_u(\rho_1)] \\ \mathrm{Tr}[\Psi_y(\rho_2)] > \mathrm{Tr}[\Psi_y(\rho_1)] \end{cases} \tag{7.4-2}$$

Theorem 7.4-1. Consider a steady-state LQ stochastic regulation problem with 
the quadratic performance index (1) parameterized by the positive input weight 
$\rho$. Assume the problem solvable. Let $\Psi_u(\rho)$ and $\Psi_y(\rho)$ be the stochastic 
steady-state covariance matrices of $u$ and $y$ when the plant is fed back by the 
steady-state LQ stochastic regulator optimal for the given $\rho$. Then, (2) holds, viz. 
$\mathrm{Tr}[\Psi_u(\rho)]$ ($\mathrm{Tr}[\Psi_y(\rho)]$) is a strictly decreasing (increasing) function of $\rho$. 
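
The monotonicity stated in the theorem can be checked numerically on a toy case. The sketch below is not the CARMA setting of this chapter: it assumes a scalar state-space plant x(k+1) = a x(k) + b u(k) + w(k) with y = x, full state feedback and white noise w of variance q, for which the steady-state LQ gain follows from a scalar Riccati iteration. The function names and the numerical values are illustrative assumptions.

```python
def lq_gain(a, b, rho, iters=10_000, tol=1e-14):
    """Steady-state LQ gain for x(k+1) = a x(k) + b u(k) + w(k), cost E{x^2 + rho u^2}."""
    p = 1.0
    for _ in range(iters):
        # Scalar Riccati recursion; its fixed point solves the algebraic Riccati equation.
        p_next = 1.0 + a * a * p - (a * b * p) ** 2 / (rho + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return a * b * p / (rho + b * b * p)


def steady_state_variances(a, b, rho, q=1.0):
    """Return (var_y, var_u) in stochastic steady state under u(k) = -K x(k)."""
    k = lq_gain(a, b, rho)
    a_cl = a - b * k                    # closed-loop pole, |a_cl| < 1 for the optimal gain
    var_y = q / (1.0 - a_cl * a_cl)     # stationary variance of a stable AR(1) process
    var_u = k * k * var_y
    return var_y, var_u


if __name__ == "__main__":
    for rho in [0.1, 0.5, 1.0, 5.0, 10.0]:
        var_y, var_u = steady_state_variances(1.2, 1.0, rho)
        print(f"rho={rho:5.1f}  E{{y^2}}={var_y:.4f}  E{{u^2}}={var_u:.4f}")
```

As the input weight grows, the printed input variance falls and the output variance rises, which is the trade-off the theorem guarantees for the optimal steady-state regulator.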

Problem 7.4-1 Extend the conclusion of Theorem 1 to the steady-state LQS regulator and the 
cost 

$$ C = \mathcal{E}\{\|y(k)\|^2_{\psi_y} + \rho\|u(k)\|^2_{\psi_u}\} $$

Besides their use in the analysis of adaptive LQ regulators with mean-square input 
constraints, the above monotonic performance properties are important for design 
purposes, in that they allow us to trade off output against input covariance 
matrices. 

We point out that the monotonic performance properties in Theorem 1 follow 
from the minimization of the unconditional expectation (1) achieved by steady- 
state LQ stochastic regulation. In contrast, Single Step Stochastic regulators, which 
in stochastic steady-state minimize the conditional expectation (3-5) , in general do 
not possess similar monotonic performance properties. 

Example 7.4-1 [MS82] Consider the SISO CARMA plant (3-1) with 

$$ A(d) = 1 - 2.75d + 2.61d^2 + 0.885d^3 $$
$$ B(d) = d - 0.5d^2 $$
$$ C(d) = 1 - 0.2d + 0.5d^2 - 0.1d^3 $$



[Figure 7.4-1: plot not reproduced; the horizontal axis is $\mathcal{E}\{u^2(k)\}$.] 

Figure 7.4-1: The relation between $\mathcal{E}\{u^2(k)\}$ and $\mathcal{E}\{y^2(k)\}$ parameterized by $\rho$ 
for the plant of Example 1 under Single Step Stochastic regulation (solid line) and 
steady-state LQS regulation (dotted line). 

Using the Single Step Stochastic regulator optimal for the cost 

$$ C = \mathcal{E}\{y^2(k+1) + \rho u^2(k) \mid y^k\} $$

the plant under consideration gives rise to non-monotonic relationships between the stochastic 
steady-state input variance $\mathcal{E}\{u^2(k)\}$, the stochastic steady-state output variance $\mathcal{E}\{y^2(k)\}$, 
and the input weight $\rho$ (Fig. 1). On the contrary, as guaranteed by Theorem 1, steady-state LQS 
regulation yields monotonic performance relationships (dotted line in Fig. 1). 

Main points of the section In contrast with Single Step Stochastic regulated 
systems, steady-state LQS regulated systems possess performance monotonicity 
properties which enable us, by varying an input weight knob, to trade off between 
output and input covariance matrices. These monotonicity properties turn out 
to be important to establish convergence results for self-tuning regulators with 
mean-square input constraints. 

7.5 Steady-State LQS Tracking and Servo 
7.5.1 Problem Formulation and Solution 

We consider the CARMA plant (3-1)-(3-3) along with a wide-sense stationary 
reference process $r$, $\dim r(t) = \dim y(t) = p$. We assume that 

r and e are mutually independent processes. (7-5-1) 

We wish to consider the following problem. 

Steady-State LQS Tracking and Servo Problem Given: the CARMA 
plant (3-1)-(3-2) with innovations satisfying the martingale difference 
properties (3-3); the performance index consisting of the unconditional expectation 

$$ \mathcal{E}\{\|y(t) - r(t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u}\} \tag{7.5-2} $$

with $\psi_y = \psi_y' > 0$, $\psi_u = \psi_u' > 0$, and $r$ a wide-sense stationary reference; the 
admissible control strategy 

$$ u(t) \in \sigma\{y^t, u^{t-1}, r^{t+\ell}\} = \sigma\{y^t, r^{t+\ell}\} ; \tag{7.5-3} $$

find, among all the admissible strategies, the ones making the closed-loop 
system internally stable and minimizing (2). 



It is clear from (3) that we are searching for an optimal 2-DOF controller (Cf. Sect. 5.8). 
For the time being, we aim at solving the mathematical problem we have just 
formulated with no concern for controller integral action or dynamic weights 
in (2). As will be shown, these issues can be accommodated in the basic theory 
by suitable simple modifications. Another point that we underline is that our 
admissible control strategy (3) allows us to select the present input $u(t)$ knowing the 
reference up to time $t + \ell$. If $\ell$ is positive and very large, the situation appears 
as an extension of the deterministic 2-DOF LQ control problem of Theorem 5.8-1 
to the present stochastic setting. We have already noticed the improvement in 
performance that can be achieved, especially with nonminimum-phase plants, by 
exploiting the knowledge of the reference future, provided that this is available to 
the controller. For the sake of generality, in this section we assume that $\ell$ can be 
any integer. According to the sign of $\ell$, we adopt two different names for the 
problem. We call it either a tracking problem, if the reference is known to the controller 
with a delay of $|\ell|$ steps, $\ell < 0$, or a servo problem if $\ell > 0$. The solution that we 
are to find, allowing $\ell$ to take any value, can be used for both the tracking and the 
servo problem. 

To begin with, let us first assume that the underlying steady-state LQS pure 
regulation problem, viz. the one with $r(t) \equiv O_p$, is solvable. Recall that its solution 
is given by Theorem 3-3 and Theorem 3-4 in the following form 

$$ M_1(d)u(t) = -N_1(d)y(t) $$

with 

$$ M_1(d)A_2(d) + N_1(d)B_2(d) = I_m $$

We now follow a line similar to that adopted after (3-72). Thus, any admissible 
control law (3) can be written as 

$$ u(t) = -M_1^{-1}(d)N_1(d)y(t) + v(t) \tag{7.5-4} $$

with $v(t)$ any wide-sense stationary process such that 

$$ v(t) \in \sigma\{y^t, r^{t+\ell}\} \tag{7.5-5} $$

Under the closed-loop control law (4), $y$ and $u$ can be expressed in terms of $e$ and 
$v$ as in (3-74)-(3-75). Consequently, the cost (2) can be split as follows 

$$ C = C_{ee} + C_{ev} + C_{er} + C_{rr} + C_{vr} + C_{vv} \tag{7.5-6} $$

where 

$$ C_{ee} := \mathcal{E}\{\|y_e(t)\|^2_{\psi_y} + \|u_e(t)\|^2_{\psi_u}\} $$
$$ C_{ev} := 2\mathcal{E}\{y_e'(t)\psi_y y_v(t) + u_e'(t)\psi_u u_v(t)\} $$
$$ C_{er} := -2\mathcal{E}\{r'(t)\psi_y y_e(t)\} = 0 \qquad [(1)] $$
$$ C_{rr} := \mathcal{E}\{\|r(t)\|^2_{\psi_y}\} $$
$$ C_{vr} := -2\mathcal{E}\{y_v'(t)\psi_y r(t)\} $$
$$ C_{vv} := \mathcal{E}\{\|y_v(t)\|^2_{\psi_y} + \|u_v(t)\|^2_{\psi_u}\} = \mathcal{E}\{\|E(d)M_1(d)v(t)\|^2\} \qquad [(3\text{-}77)] $$

We now show that the key property $C_{ev} = 0$, that was proved in Lemma 3-3 for 
the underlying LQ stochastic pure regulation problem, holds true if $v$ satisfies (5). 

Lemma 7.5-1. Consider the cost $C_{ev}$ where the involved processes are as in (3-74) 
and (3-75) with $v$ satisfying (5). Let $e$ have the martingale difference properties 
(3-3). Then $C_{ev} = 0$. 

Proof Here (3-80) still holds true. We also have 

$$ K_{ev}(k) = \mathcal{E}\{e(t+k)v'(t)\} $$
$$ = \mathcal{E}\{\mathcal{E}\{e(t+k) \mid y^t, r^{t+\ell}\}\,v'(t)\} $$
$$ = \mathcal{E}\{\mathcal{E}\{e(t+k) \mid e^t\}\,v'(t)\} \qquad [(1)] $$
$$ = O_{p\times m}, \quad \forall k \ge 1 $$

Hence, the proof follows by the same argument as in Lemma 3-3. 

The main result can now be stated. 

Theorem 7.5-1. (Steady-state LQS tracking and servo) Suppose that the 
underlying steady-state LQS pure regulation problem is solvable. Let the left spectral 
factor $E(d)$ in (3-39) be strictly Hurwitz. Then, the optimal control law for the 
steady-state LQS tracking and servo problem is given by 

$$ M_1(d)u(t) = -N_1(d)y(t) + u_c(t) \tag{7.5-7} $$

where $M_1(d)$ and $N_1(d)$ are the stable transfer matrices solving the underlying 
steady-state LQS pure regulation problem, and $u_c(t)$ is the command or feedforward 
input defined by 

$$ u_c(t) = E^{-1}(d)\,\mathcal{E}\{E^{-*}(d)B_2^*(d)\psi_y r(t) \mid r^{t+\ell}\} \tag{7.5-8} $$

Proof Since in (6) $C_{ee}$ and $C_{rr}$ are not affected by $v$, $C_{er} = 0$, and $C_{ev} = 0$ by Lemma 1, the 
optimal control law is given by (4) with $v(t) \in \sigma\{y^t, r^{t+\ell}\}$ minimizing, for $z(t) := M_1(d)v(t)$, 

$$ C_{vv} + C_{vr} = \mathcal{E}\{\|E(d)z(t)\|^2 - 2[B_2(d)z(t)]'\psi_y r(t)\} $$
$$ = \mathcal{E}\{\|E(d)z(t)\|^2 - 2z'(t)[B_2^*(d)\psi_y r(t)]\} $$
$$ = \mathcal{E}\{\|E(d)z(t)\|^2 - 2[E(d)z(t)]'[E^{-*}(d)B_2^*(d)\psi_y r(t)]\} $$
$$ = \mathcal{E}\{\|E(d)z(t) - E^{-*}(d)B_2^*(d)\psi_y r(t)\|^2\} - \mathcal{E}\{\|E^{-*}(d)B_2^*(d)\psi_y r(t)\|^2\} $$

While in the last line the second term is not affected by $z(t)$, the minimum of the first is attained 
at $z(t) = M_1(d)v(t) = u_c(t)$ with $u_c(t)$ as in (8). 

Recalling (3-52) and (3-56), we have 

$$ \mathcal{R}(d) := E(d)M_1(d) = X(d)D_2^{-1}(d) \tag{7.5-9} $$
$$ \mathcal{S}(d) := E(d)N_1(d) = Y(d)D_2^{-1}(d) \tag{7.5-10} $$

and (7) and (8) can be equivalently rewritten as follows 

$$ \mathcal{R}(d)u(t) = -\mathcal{S}(d)y(t) + v_c(t) \tag{7.5-11} $$
$$ v_c(t) = \mathcal{E}\{E^{-*}(d)B_2^*(d)\psi_y r(t) \mid r^{t+\ell}\} \tag{7.5-12} $$

Eq. (11) and (12) should be compared with (5.8-26) and (5.8-27) which give the 
solution of the deterministic version of the present stochastic control problem. We 
refer the reader to the relevant part of Sect. 5.8 where the strictly "anticipative" 
nature of 

$$ v(t) = E^{-*}(d)B_2^*(d)\psi_y r(t) = \bar{E}^{-1}(d)\bar{B}_2(d)\psi_y r(t) $$

was thoroughly discussed. 

Example 7.5-1 Consider again Example 5.8-2 where we found 

$$ v(t) := E^{-*}(d)B_2^*(d)\psi_y r(t) = k^{-1}\Big[ r(t+1) + (1 - b^2)\sum_{j=1}^{\infty} b^{-j} r(t+j+1) \Big] $$

If $\ell > 0$, the feedforward input $v_c(t)$ can be decomposed as $\hat{v}_c(t) + \tilde{v}_c(t)$ 

$$ \hat{v}_c(t) := k^{-1}\Big[ r(t+1) + (1 - b^2)\sum_{j=1}^{\ell-1} b^{-j} r(t+j+1) \Big] $$
$$ \tilde{v}_c(t) := k^{-1}(1 - b^2)\,b^{-\ell+1}\sum_{j=1}^{\infty} b^{-j}\,\mathcal{E}\{r(t+\ell+j) \mid r^{t+\ell}\} $$

Note that of these two components only $\tilde{v}_c(t)$ depends on the statistical properties of the reference. 

Eq. (12) can be further elaborated when a stochastic model for the reference is 
given. In this connection let us assume that 

$$ r(t) = G_2(d)F_2^{-1}(d)n(t) \tag{7.5-13} $$

where $n$, $n(t) \in \mathbb{R}^p$, has the martingale difference properties 

$$ \mathcal{E}\{n(k+1) \mid n^k\} = O_p \quad \text{a.s.} \tag{7.5-14a} $$
$$ \infty > \mathcal{E}\{n(k+1)n'(k+1) \mid n^k\} = \Psi_n > 0 \quad \text{a.s.} \tag{7.5-14b} $$

$G_2(d)F_2^{-1}(d)$ is a right coprime MFD with 

$$ G_2(d) \text{ and } F_2(d) \text{ both strictly Hurwitz} \tag{7.5-15} $$

Assumptions (15) entail no substantial limitation. In fact, since $r(t)$ is wide-sense 
stationary, $F_2(d)$ must be strictly Hurwitz. Further, strict Hurwitzianity of $G_2(d)$ 
means that $G_2(d)F_2^{-1}(d)$ is stably invertible and, hence, (13) represents a standard 
innovations representation of $r$ (Cf. Theorem 6.2-4). Let 

$$ p := \max\{\partial E(d),\; \partial B_2(d) - (\ell \wedge 0)\} \tag{7.5-16} $$

where $\wedge$ denotes minimum and $\partial E(d)$ the degree of $E(d)$. Let: 

$$ \bar{E}(d) := d^{p}E^*(d) \qquad \bar{B}_2(d) := d^{p+(\ell\wedge 0)}B_2^*(d) \tag{7.5-17} $$

Proposition 7.5-1. Let reference r be modelled as in (13)-(15). Then, under the 
same assumptions as in Theorem 1, the optimal feedforward input (12) is given by 

$$ v_c(t) = \Gamma(d)G_2^{-1}(d)r(t+\ell) \tag{7.5-18} $$

where $\Gamma(d)$ and $L(d)$ are $m \times p$ polynomial matrices given by the unique solution 
of minimum degree w.r.t. $L(d)$, i.e. $\partial L(d) < \partial\bar{E}(d)$, of the following bilateral 
Diophantine equation 

$$ \bar{E}(d)\Gamma(d) + L(d)F_2(d) = d^{\,\ell\vee 0}\,\bar{B}_2(d)\psi_y G_2(d) \tag{7.5-19} $$

where $\vee$ denotes maximum. 

Proof First, from the assumptions on $\bar{E}(d)$ and $F_2(d)$, and Result C-5 and C-6 in the Appendix 
C, it follows that (19) has a unique solution $(\Gamma(d), L(d))$ with $\partial L(d) < \partial\bar{E}(d)$. Next, taking into 
account (13), (12) becomes 

$$ v_c(t) = \mathcal{E}\{E^{-*}(d)B_2^*(d)\psi_y G_2(d)F_2^{-1}(d)n(t) \mid n^{t+\ell}\} $$

since, $G_2(d)F_2^{-1}(d)$ being stably invertible, $\sigma\{r^t\} = \sigma\{n^t\}$. We consider separately the cases 
$\ell \ge 0$ and $\ell < 0$. 

Assume $\ell \ge 0$. Then, $p = \max\{\partial E(d), \partial B_2(d)\}$ and $\bar{B}_2(d) = d^{p}B_2^*(d)$. Hence, 

$$ v_c(t) = \mathcal{E}\{\bar{E}^{-1}(d)\bar{B}_2(d)\psi_y G_2(d)F_2^{-1}(d)\,d^{\ell}\,n(t+\ell) \mid n^{t+\ell}\} $$
$$ = \mathcal{E}\{\bar{E}^{-1}(d)[\bar{E}(d)\Gamma(d) + L(d)F_2(d)]F_2^{-1}(d)\,n(t+\ell) \mid n^{t+\ell}\} $$
$$ = \mathcal{E}\{[\Gamma(d)F_2^{-1}(d) + \bar{E}^{-1}(d)L(d)]\,n(t+\ell) \mid n^{t+\ell}\} $$
$$ = \Gamma(d)F_2^{-1}(d)\,n(t+\ell) $$

where the last equality follows from (14a) and the degree constraint on $L(d)$. In fact, $\bar{E}^{-1}(d)L(d)$ 
turns out to be a strictly anticausal matrix (Cf. 4.2-10). 

Assume $\ell < 0$. Then $p = \max\{\partial E(d), \partial B_2(d) + |\ell|\}$ and $\bar{B}_2(d) = d^{p-|\ell|}B_2^*(d)$. Hence, 

$$ v_c(t) = \mathcal{E}\{\bar{E}^{-1}(d)\bar{B}_2(d)\,d^{-\ell}\,\psi_y G_2(d)F_2^{-1}(d)\,n(t) \mid n^{t+\ell}\} $$
$$ = \mathcal{E}\{\bar{E}^{-1}(d)\bar{B}_2(d)\psi_y G_2(d)F_2^{-1}(d)\,n(t+\ell) \mid n^{t+\ell}\} $$
$$ = \mathcal{E}\{[\Gamma(d)F_2^{-1}(d) + \bar{E}^{-1}(d)L(d)]\,n(t+\ell) \mid n^{t+\ell}\} $$
$$ = \Gamma(d)F_2^{-1}(d)\,n(t+\ell) $$

where the last equality follows by the same argument used in the $\ell \ge 0$ case. 
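
For a SISO plant, a bilateral Diophantine equation of the type (19) reduces to a scalar polynomial equation that can be solved as a linear system in the unknown coefficients. The following is a minimal sketch under that SISO assumption; the function names, the ascending-power coefficient convention, and the numerical polynomials are illustrative choices, not taken from the text.

```python
def poly_mul(p, q):
    """Product of two polynomials in d (coefficient lists in ascending powers)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def solve_linear(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular system: the two polynomials are not coprime")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def solve_diophantine(e_bar, f, rhs):
    """Solve e_bar*gamma + l*f = rhs with deg(l) < deg(e_bar); returns (gamma, l)."""
    n_e, n_f = len(e_bar) - 1, len(f) - 1
    deg_gamma = max(n_f - 1, len(rhs) - 1 - n_e, 0)
    n_gamma = deg_gamma + 1
    n_eq = n_e + n_gamma                 # one equation per power of d
    a = [[0.0] * n_eq for _ in range(n_eq)]
    for j in range(n_gamma):             # columns multiplying the gamma coefficients
        for i, c in enumerate(e_bar):
            a[i + j][j] += c
    for j in range(n_e):                 # columns multiplying the l coefficients
        for i, c in enumerate(f):
            a[i + j][n_gamma + j] += c
    b = list(rhs) + [0.0] * (n_eq - len(rhs))
    x = solve_linear(a, b)
    return x[:n_gamma], x[n_gamma:]


if __name__ == "__main__":
    # Illustrative data: e_bar(d) = 0.5 + d, f(d) = 1 - 0.8 d, right-hand side d.
    gamma, l = solve_diophantine([0.5, 1.0], [1.0, -0.8], [0.0, 1.0])
    print("gamma =", gamma, " l =", l)
```

The degree bound on l is what makes the solution unique; the degree of gamma is then fixed by matching the highest power on both sides.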

Example 7.5-2 Consider again Example 1 and assume a first order AR model for the reference 

$$ (1 - fd)\,r(t) = n(t) \qquad |f| < 1 $$

Then, solving (19), we get for (18) 

Examples 1 and 2 indicate one of the advantages of the more general expression 
(12) over (18). In fact, when a stochastic model for the reference is not available, 
but $\ell$ is positive and large enough, from (12) a tight approximation of the optimal 
feedforward input can still be obtained simply by replacing $v_c(t)$ with $\hat{v}_c(t)$. In the 
two examples above for $\tilde{v}_c(t) = v_c(t) - \hat{v}_c(t)$ we find 



$$ \mathcal{E}\{\tilde{v}_c^2(t)\} = \frac{f^2(1 - b^2)^2\,\Psi_n}{k^2\,|b|^{2(\ell-1)}\,(b - f)^2(1 - f^2)} $$

which, being $|b| > 1$, decays exponentially as $\ell$ increases. 
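
The role of the reference model in the unknown part of the feedforward input can be illustrated with a short computation. The sketch below assumes the first order AR reference of Example 2, for which the j-step-ahead conditional mean is f**j times the last known reference sample, and evaluates a discounted sum of predictions with weights b**(-j); the function names and numerical values are illustrative assumptions.

```python
def ar1_prediction(f, r_last, j):
    """j-step-ahead conditional mean for (1 - f d) r(t) = n(t): f**j * r_last."""
    return f ** j * r_last


def discounted_prediction_sum(b, f, r_last, terms=200):
    """Truncated sum over j >= 1 of b**(-j) times the j-step-ahead prediction."""
    return sum(b ** (-j) * ar1_prediction(f, r_last, j) for j in range(1, terms + 1))


if __name__ == "__main__":
    b, f, r_last = 2.0, 0.8, 1.5
    numeric = discounted_prediction_sum(b, f, r_last)
    closed_form = f / (b - f) * r_last    # geometric series with ratio f / b
    print(numeric, closed_form)
```

With |b| > 1 and |f| < 1 the series converges geometrically, so the contribution of the unknown reference future is a fixed multiple of the last known sample, scaled down further by the factor in b that depends on the preview length.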
Problem 7.5-1 (Tracking a predictable reference) Assume that the reference is a predictable 
process, viz. its realizations satisfy the equation 

$$ W(d)r(t) = O_p \quad \text{a.s.} \tag{7.5-20} $$

where $W(d)$ is a $p \times p$ polynomial matrix such that all roots of $\det W(d)$ are on the unit circle 
and simple. Show that the optimal feedforward input $v_c(t)$ in (11) is given by the FIR filter 

$$ v_c(t) = \Gamma(d)r(t) \tag{7.5-21} $$

where $\Gamma(d)$ and $L(d)$ are $m \times p$ polynomial matrices given by the unique solution of minimum 
degree w.r.t. $\Gamma(d)$, i.e. $\partial\Gamma(d) < \partial W(d)$, of the following bilateral Diophantine equation 

$$ \bar{E}(d)\Gamma(d) + L(d)W(d) = \bar{B}_2(d)\psi_y \tag{7.5-22} $$

where $\bar{E}(d)$ and $\bar{B}_2(d)$ are as in (16) and (17) with $\ell = 0$. 

Problem 7.5-2 Consider again Example 5.8-2. Assume that the reference $r$ satisfies (20) with 
$W(d) = 1 - d + d^2$. By using the results of Problem 1, prove that the optimal feedforward input 
$v_c(t)$ in (11) is given by 

$$ v_c(t) = \frac{k^{-1}b}{b^2 - b + 1}\,\big[(2b - 1)\,r(t) + b(b - 2)\,r(t-1)\big] $$

Problem 7.5-3 (1-DOF steady-state LQS tracking) Consider again the steady-state LQS 
tracking with the admissible control strategy (7.5-3) replaced by 

$$ u(t) \in \sigma\{\varepsilon_y^t, u^{t-1}\} $$

where $\varepsilon_y(t) := y(t) - r(t)$. Assume that the reference $r$ is modelled as an ARMA process 

$$ F(d)r(t) = G(d)\nu(t) $$

with $F(d)$ and $G(d)$ both strictly Hurwitz, and $\nu$ satisfying (14). Show that the problem can be 
reformulated as a steady-state LQS pure regulation problem. 



7.5.2 Use of Plant CARIMA Models 

The direct use of the results so far obtained for the tracking and servo problem 
does not ensure asymptotic tracking of constant references and rejection of constant 
disturbances. In order to guarantee these properties, we can extend the approach 
of Sect. 5.8 to the present stochastic setting. 

We assume that the CARMA plant (3-1)-(3-2) is affected by a constant 
disturbance $n$, $\Delta(d)n(t) = O_p$, $\Delta(d) := \mathrm{diag}(1 - d)$, viz. 

$$ A(d)y(t) = B(d)u(t) + C(d)e(t) + n(t) \tag{7.5-23} $$

or, premultiplying by $\Delta(d)$, 

$$ \Delta(d)A(d)y(t) = B(d)\delta u(t) + \Delta(d)C(d)e(t) \tag{7.5-24} $$

We see that, in the present stochastic case, the situation is complicated by the 
presence of the factor $\Delta(d)$ in the innovations polynomial matrix. According to (3-60), 
such a presence indicates that in the generic case some closed-loop eigenvalues of 
the steady-state LQS regulated system are located at 1. More specifically, in such 
a case, the implicit Kalman filter embedded in the steady-state LQS regulator 
would exhibit undamped constant modes. To avoid such an undesirable situation, 
a heuristic approach consists of designing the control law as if the 
innovations $e$ were a random walk $\Delta(d)e(t) = \nu(t)$ or 

$$ e(t) = e(t-1) + \nu(t) \tag{7.5-25} $$

with $\nu$ a zero-mean wide-sense stationary white process with nonsingular 
covariance matrix. Note that (25) is unacceptable in that, in stochastic steady state, it yields 
a nonstationary process $e$ with an ever increasing covariance matrix. Nonetheless, 
plain substitution of (24) with the model 

$$ \Delta(d)A(d)y(t) = B(d)\delta u(t) + C(d)\nu(t) \tag{7.5-26} $$

where $\nu$ has the properties stated above, leads us to recover acceptable closed-loop 
eigenvalues for the steady-state LQS regulated system at the expense of a response 
degradation to the stochastic disturbances acting on the plant. The plant 
representation (26) is referred to as a CARIMA (Controlled Auto Regressive Integrated 
Moving Average) model. For control design of the plant (24), we use the model 
(26) along with the performance index 

$$ \mathcal{E}\{\|y(t) - r(t)\|^2_{\psi_y} + \|\delta u(t)\|^2_{\psi_u}\} \tag{7.5-27} $$

and the admissible control strategy 

$$ \delta u(t) \in \sigma\{y^t, \delta u^{t-1}, r^{t+\ell}\} \tag{7.5-28} $$

Hence, by Theorem 1 we find the corresponding steady-state LQS tracking and 
servo solution. 



Problem 7.5-4 Assume that the steady-state LQS regulator for the CARIMA model (26) yields 
an internally stable feedback system. Then, show that for any constant reference $r(t) \equiv r$, the 
2-DOF steady-state LQS controller resulting from (26)-(28) applied to the plant (24) yields an 
offset-free closed-loop system and asymptotic rejection of constant disturbances, provided that 
$\dim y = \dim u$. 

7.5.3 Dynamic Control Weight 

We extend to the present stochastic case the considerations made at the end of 
Sect. 5.8 on the effects of filtered variables in the cost to be minimized. Specifically, 
instead of (27), we consider the cost 

$$ \mathcal{E}\{\|y_H(t) - r(t)\|^2_{\psi_y} + \|\delta u_H(t)\|^2_{\psi_u}\} \tag{7.5-29} $$

where $y_H$ and $u_H$ are filtered versions of $y$ and, respectively, $u$ 

$$ y_H(t) = H(d)y(t) \qquad u_H(t) = H(d)u(t) \tag{7.5-30} $$

We assume that $H(d)$ is a strictly Hurwitz polynomial and, for the sake of simplicity, 
the plant to be SISO. The model (26) can now be represented as 

$$ \Delta(d)A(d)y_H(t) = B(d)\delta u_H(t) + H(d)C(d)\nu(t) \tag{7.5-31} $$

The 2-DOF steady-state LQS controller minimizing (29) for the plant (31) is given 
by (7) 

$$ M_1(d)\delta u_H(t) = -N_1(d)y_H(t) + u_c(t) $$

Assuming $\Delta(d)A(d)$, $B(d)$ and $H(d)C(d)$ pairwise coprime, we find for the output 
of the model (24) controlled according to the above equation 

, , B(d) r , . X(d) A(d) ,. 

^ = m u%t) + ^)md) e{t) (7 - 5 - 32) 
where $\sigma^2 := \mathcal{E}\{e^2(t)\}$, and $X(d)$ and $E(d)$ are the polynomials as in Sect. 3. 
Eq. (32) shows that filtering $y(t)$ and $\delta u(t)$ as in (29) and (30) has the effect 
of filtering both the reference and the innovations by $1/H(d)$. Notice however 
that the latter filtering action is only approximate in that also the solution of 
(3-45) and (3-55), and hence the polynomial $X(d)$, is implicitly affected by $H(d)$. 
According to such considerations, the use of a high-pass polynomial $H(d)$, such that 
$1/H(d)$ cuts off frequencies outside the desired closed-loop bandwidth, may turn 
out to be beneficial for both shaping the reference and attenuating high frequency 
disturbances. Notice that in fact, if $e(t)$ is white, in (32) $\Delta(d)e(t)$ has most of its 
power at high frequencies. 

Main points of the section The optimal 2-DOF controller of Sect. 5.8 is 
extended to a steady-state LQ stochastic setting. The optimal feedforward action can 
be expressed in terms of the conditional expectation (8). This has the advantage of 
enabling us to compute the optimal feedforward variable approximately without 
using any reference stochastic model, provided that the reference is known a few 
steps in advance. CARIMA plant models whose inputs are plant input increments 
are often adopted in applications so as to asymptotically achieve tracking of 
constant references and rejection of constant disturbances. Dynamic cost weights in 
the performance index may be beneficial for both reference shaping and stochastic 
disturbance attenuation. 



7.6 $H_\infty$ and LQ Stochastic Control 

We have found that in steady-state LQ Stochastic control the characteristic poly- 
nomial of the optimally controlled system depends on the innovations polynomial 
$C(d)$. Obviously, if $C(d) = I_p$, the robust stability properties of LQ regulated 
systems hold true since the results of Sect. 4.6 are still applicable. However, in gen- 
eral, stability robustness can deteriorate if an unfavourable C(d) polynomial is used. 
Such a situation has been already met with the plant (5-24). On that occasion, we 
have seen that a reasonable heuristic approach is to design the "optimal" controller 
for a mismatched plant model where the C(d) polynomial is suitably modified. A 
similar heuristic approach can be in general adopted so as to possibly recover the 
LQ robust stability properties: this is usually referred to as the LQG/LTR (Linear 
Quadratic Gaussian/Loop Transfer Recovery) technique [Kwa69], [DS79], [DS81], 
[Mac85], [IT86b], [AM90]. 

A more systematic approach is to consider the following minimax regulation 
problem. Let the plant be represented by 

$$ y(t) = P(d)u(t) + Q(d)n(t) \tag{7.6-1} $$

where $n$ is a $p$-dimensional zero-mean white disturbance with identity covariance 
matrix, and $P(d)$ and $Q(d)$ are rational transfer matrices such that $P(0) = O_{p\times m}$. 
We note that 

$$ \mathcal{E}\{\|Q(d)n(t)\|^2\} = \mathrm{Tr}\langle Q(d)\Psi_n(d)Q^*(d)\rangle = \mathrm{Tr}\langle Q^*(d)Q(d)\rangle = \|Q(d)\|^2 \tag{7.6-2} $$

where $\|Q(d)\|$ denotes the norm introduced in (3.1-12). Here $\|Q(d)\|^2$ equals the 
power of the disturbance $Q(d)n(t)$, i.e. the sum of the variances of its components. 
The Minimax LQ Stochastic Linear Regulation problem consists of finding linear 
compensators 

$$ u(t) = -K(d)y(t) \tag{7.6-3} $$

minimizing the cost (3-19) for the worst possible disturbance of bounded power, 
viz., 

$$ \inf_{K(d)}\;\sup_{\|Q(d)\|\le 1}\;\mathcal{E}\{\|y(t)\|^2_{\psi_y} + \|u(t)\|^2_{\psi_u}\} \tag{7.6-4} $$

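In closed-loop we find 

$$ \begin{bmatrix} y(t) \\ u(t) \end{bmatrix} = \begin{bmatrix} S(d) \\ -T(d) \end{bmatrix} Q(d)n(t) \tag{7.6-5} $$

where 

$$ S(d) := [I_p + P(d)K(d)]^{-1} \quad \text{and} \quad T(d) := K(d)S(d) \tag{7.6-6} $$

are the sensitivity matrix and the power transfer matrix, respectively, of the 
feedback system. Then, if $C$ denotes the expectation in (4), we have 

$$ C = \mathrm{Tr}\langle Q^*(d)\mathcal{S}^*(d)\mathcal{S}(d)Q(d)\rangle = \|\mathcal{S}(d)Q(d)\|^2 \tag{7.6-7} $$

where $\mathcal{S}(d)$ is the mixed sensitivity matrix 

$$ \mathcal{S}(d) := \begin{bmatrix} \varphi_y S(d) \\ \varphi_u T(d) \end{bmatrix} \tag{7.6-8} $$

with 

$$ \varphi_y'\varphi_y = \psi_y \quad \text{and} \quad \varphi_u'\varphi_u = \psi_u $$

Thus, (4) becomes 

$$ \inf_{K(d)}\;\sup_{\|Q(d)\|\le 1}\;\|\mathcal{S}(d)Q(d)\|^2 = \inf_{K(d)}\;\|\mathcal{S}(d)\|^2_\infty \tag{7.6-9} $$

where 

$$ \|\mathcal{S}(d)\|_\infty := \operatorname*{ess\,sup}_{0\le\theta<2\pi}\;\bar\sigma\big[\mathcal{S}(e^{j\theta})\big] \tag{7.6-10} $$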

is the so-called H-infinity ($H_\infty$) norm$^1$ of $\mathcal{S}(d)$. The equality in (9) can be proved 
as in [DV75]. We see that the Minimax LQ Stochastic Linear regulation problem 
amounts to finding compensator transfer matrices $K(d)$ minimizing the $H_\infty$-norm 
of $\mathcal{S}(d)$, viz. the value of the frequency peak of the maximum singular value of 
$\mathcal{S}(e^{j\theta})$, $\theta \in [0, 2\pi)$. 
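
For a SISO loop, the frequency peak described above can be approximated on a grid of the unit circle. The sketch below assumes scalar transfer functions supplied as Python callables of the complex variable d, scalar weights psi_y and psi_u, and a uniform grid; in the scalar case the largest singular value of the stacked mixed sensitivity vector is simply its Euclidean norm. The function name, grid size, and the example plant and compensator are illustrative assumptions.

```python
import cmath
import math


def mixed_sensitivity_peak(plant, compensator, psi_y=1.0, psi_u=1.0, n_grid=4096):
    """Grid approximation of the H-infinity norm of [phi_y S; phi_u T] for a SISO loop."""
    peak = 0.0
    for i in range(n_grid):
        d = cmath.exp(1j * 2.0 * math.pi * i / n_grid)   # point on the unit circle
        p, k = plant(d), compensator(d)
        s = 1.0 / (1.0 + p * k)                          # sensitivity S(d)
        t = k * s                                        # power transfer function T(d)
        sigma_max = math.sqrt(psi_y * abs(s) ** 2 + psi_u * abs(t) ** 2)
        peak = max(peak, sigma_max)
    return peak


if __name__ == "__main__":
    # Illustrative loop: one-step-delay plant P(d) = d with constant gain K(d) = 0.5.
    print(mixed_sensitivity_peak(lambda d: d, lambda d: 0.5))
```

A grid search only gives a lower bound on the true peak, so a finer grid or a local refinement around the maximizer may be needed when the response has sharp resonances.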

A discussion on how to solve (1)-(10) would lead us too far afield, our main 
interest being in indicating that robust stability can be systematically obtained by 
the steady-state LQ Stochastic Linear regulator for the worst possible disturbance 
case, or, equivalently, for the worst possible dynamic weights in the cost to be 
minimized. To this end, we next show that (1)-(10) are equivalent to a deterministic 
minimax regulation problem which is used [Fra91] for systematically designing 
robust compensators. This regulation problem is called the $H_\infty$ Mixed Sensitivity 
regulation problem. Here the plant is given by 

$$ y(t) = P(d)u(t) + \gamma(t) \tag{7.6-11} $$

where, in contrast with (1), the disturbance is represented by a vector-valued causal 
sequence $\gamma$ of finite energy 

$$ \|\gamma(\cdot)\|^2 = \sum_{k=0}^{\infty}\gamma'(k)\gamma(k) < \infty $$

$^1$ $H_\infty$ denotes the Hardy space consisting of all matrix-valued functions on $\mathbb{C}$ which are analytic 
and bounded in the open unit disc [Pra87]. 

fe=0 

The $H_\infty$ Mixed Sensitivity regulation problem, [VJ84], [Kwa85], is to find linear 
compensators (3) minimizing the cost 

$$ J = \sum_{k=0}^{\infty}\big[\|y(k)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\big] \tag{7.6-12} $$

for the worst possible deterministic disturbance of bounded energy, viz. 

$$ \inf_{K(d)}\;\sup_{\|\gamma(\cdot)\|\le 1}\;J \tag{7.6-13} $$

By the result in [DV75], this amounts again to finding $K(d)$ so as to minimize the 
$H_\infty$-norm of $\mathcal{S}(d)$. 

We state these results formally in the following theorem. 

Theorem 7.6-1. The Minimax LQ Stochastic Linear regulation problem is 
equivalent to the $H_\infty$ Mixed Sensitivity regulation problem. 

$H_\infty$ optimal sensitivity problems were introduced into control engineering by [Zam81] 
in the early eighties in order to cope systematically with the robust stability 
problem. The connection established in Theorem 1 is important in that it suggests that 
robust stability can be achieved in LQ Stochastic control by suitably dynamically 
weighting the variables in the cost to be minimized. Indeed, if we consider the 
SISO CARMA plant 

$$ A(d)y(t) = B(d)u(t) + A(d)e(t) \tag{7.6-14} $$

and the cost 

$$ C = \mathcal{E}\{\psi_y y_f^2(t) + \psi_u u_f^2(t)\} \tag{7.6-15} $$

where 

$$ y_f(t) := Q(d)y(t) \quad \text{and} \quad u_f(t) := Q(d)u(t) $$

with $Q(d)$ a stable and stably invertible transfer function, we see that the compensator 

$$ u(t) = -K(d)y(t) $$

solving, for the given CARMA plant, the minimax problem 

$$ \inf_{K(d)}\;\sup_{\|Q(d)\|\le 1}\;C $$

coincides with the $H_\infty$ Mixed Sensitivity compensator for the deterministic plant 
$A(d)y(t) = B(d)u(t)$. 






Problem 7.6-1 Consider the Minimax LQ Stochastic Linear regulation problem when $C$ is as in 
(3-64). Discuss this dynamic weighted version of the problem by introducing in $C$ filtered variables 
$y_f(t) = W_y(d)y(t)$ and $u_f(t) = W_u(d)u(t)$. 

Main points of the section The $H_\infty$ Mixed Sensitivity compensators coincide 
with the ones solving the steady-state LQ Stochastic Linear regulation problem for 
the worst possible disturbance case, or, equivalently, for the worst possible dynamic 
weights in the cost to be minimized. 

7.7 Predictive Control of CARMA Plants 

Stochastic SIORHR 

We wish to extend Stabilizing I/O Receding Horizon Regulation (SIORHR) to SISO 
CARMA plants. SIORHR was introduced and discussed in Chapter 5 within a 
deterministic setting. Here we assume that the plant to be regulated is represented 
by a SISO CARMA model 

$$ A(d)y(t) = B(d)u(t) + C(d)e(t) \tag{7.7-1} $$

where A(0) = C(0) = 1 and 

• [ B(d) C(d) ] is an irreducible transfer matrix; (7.7-2a) 

• C(d) is strictly Hurwitz; (7.7-2b) 

• the gcd of A(d) and B(d) is strictly Hurwitz. (7.7-2c) 

Except for strict Hurwitzianity of $C(d)$, these assumptions are the same as in 
(3-2). Strict Hurwitzianity of $C(d)$ is here adopted in that it simplifies SIORHR 
synthesis. Similarly to (3-3), we also assume that the innovations process $e$ satisfies 
the following martingale difference properties 

$$ \mathcal{E}\{e(t+1) \mid e^t\} = 0 \quad \text{a.s.} \tag{7.7-3a} $$
$$ \mathcal{E}\{e^2(t+1) \mid e^t\} = \sigma_e^2 > 0 \quad \text{a.s.} \tag{7.7-3b} $$

Finally, we assume that 

$$ \operatorname{ord} B(d) = 1 + \ell \tag{7.7-4a} $$

viz., the plant exhibits a deadtime $\ell \in \mathbb{Z}_+$ in addition to the intrinsic one. 
Consequently, (Cf. (5.4-22)) 

$$ B(d) = d^{\ell}\tilde{B}(d) \tag{7.7-4b} $$

with $\tilde{B}(d)$ as in (5.4-22). 

It is convenient to introduce the filtered I/O variables 

$$ \gamma(t) := \frac{1}{C(d)}\,y(t) \qquad \nu(t) := \frac{1}{C(d)}\,u(t) \tag{7.7-5} $$

so as to represent (1) by the CAR model 

$$ A(d)\gamma(t) = B(d)\nu(t) + e(t). \tag{7.7-6} $$
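
The passage from the CARMA description to the CAR description through filtered I/O variables can be checked by simulation. The sketch below assumes first order polynomials A(d) = 1 + a1 d, B(d) = b1 d and C(d) = 1 + c1 d, zero initial conditions, and arbitrary test signals; it filters the simulated input and output by 1/C(d) and evaluates the residual of the CAR equation. All function names and numerical values are illustrative assumptions.

```python
import random


def simulate_carma(a1, b1, c1, u, e):
    """y(t) = -a1 y(t-1) + b1 u(t-1) + e(t) + c1 e(t-1), zero initial conditions."""
    y = []
    for t in range(len(u)):
        y_prev = y[t - 1] if t > 0 else 0.0
        u_prev = u[t - 1] if t > 0 else 0.0
        e_prev = e[t - 1] if t > 0 else 0.0
        y.append(-a1 * y_prev + b1 * u_prev + e[t] + c1 * e_prev)
    return y


def filter_by_inverse_c(c1, x):
    """out(t) = x(t) - c1 out(t-1), i.e. out = x / C(d) with C(d) = 1 + c1 d."""
    out, prev = [], 0.0
    for value in x:
        prev = value - c1 * prev
        out.append(prev)
    return out


def car_residuals(a1, b1, gamma, nu, e):
    """Residual of A(d) gamma(t) = B(d) nu(t) + e(t) at each time step."""
    res = []
    for t in range(len(e)):
        g_prev = gamma[t - 1] if t > 0 else 0.0
        n_prev = nu[t - 1] if t > 0 else 0.0
        res.append(gamma[t] + a1 * g_prev - b1 * n_prev - e[t])
    return res


if __name__ == "__main__":
    rng = random.Random(0)
    u = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    e = [rng.gauss(0.0, 1.0) for _ in range(200)]
    y = simulate_carma(-0.5, 1.0, 0.3, u, e)
    gamma = filter_by_inverse_c(0.3, y)
    nu = filter_by_inverse_c(0.3, u)
    print(max(abs(r) for r in car_residuals(-0.5, 1.0, gamma, nu, e)))
```

The residual stays at rounding level because filtering by 1/C(d) is stable when C(d) is strictly Hurwitz, which is the reason that assumption is made on the innovations polynomial.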
We are now ready to formally state the SIORHR problem for CARMA plants. 

Stochastic SIORHR Consider the CARMA plant (1)-(4) under the 
assumption that for all negative time steps the plant inputs $u(k)$ have been 
measurable w.r.t. the $\sigma$-field generated by $\{\gamma^k, \nu^{k-1}\}$ 

$$ u(k) \in \sigma\{\gamma^k, \nu^{k-1}\} \tag{7.7-7} $$

with a.s. bounded $\gamma^0$, $\nu^{-1}$. Consider next the problem of finding, whenever 
it exists, an "open-loop" input sequence 

$$ u(k) = f(k, \gamma^0, \nu^{-1}) \in \sigma\{\gamma^0, \nu^{-1}\}, \quad k = 0, \cdots, T-1, \tag{7.7-8} $$

minimizing the conditional expectation 

$$ \frac{1}{T}\sum_{k=0}^{T-1}\mathcal{E}\{\psi_y y^2(k+\ell) + \psi_u u^2(k) \mid \gamma^0, \nu^{-1}\} \tag{7.7-9} $$

under the constraints 

$$ u^{T+n-2}_{T} = O_{n-1} \qquad \mathcal{E}\{y^{T+\ell+n-1}_{T+\ell} \mid \gamma^0, \nu^{-1}\} = O_n \quad \text{a.s.} \tag{7.7-10} $$

Then, the feedback compensator 

$$ u(t) = f(0, \gamma^t, \nu^{t-1}) \tag{7.7-11} $$

is referred to as the stochastic SIORHR with prediction horizon $T$. 

We next study how to solve the above problem along similar lines as in Sect. 5.5. 
It is known (Cf. Example 5.4-1) that (6) can be represented in state-space form by 
introducing the state-vector 

$$ s(t) := \Big[\,(\gamma^{t}_{t-n_a+1})' \;\; (\nu^{t-1}_{t-\ell-n_b+1})'\,\Big]' \in \mathbb{R}^{n_a+\ell+n_b-1} $$

where $n_a = \partial A(d)$ and $n_b = \partial\tilde{B}(d)$ or $\ell + n_b = \partial B(d)$. In fact we have 

$$ s(t+1) = \Phi s(t) + G\nu(t) + Le(t+1) $$
$$ \gamma(t) = Hs(t) $$

with $(\Phi, G, H)$ given similarly to (5.4-3)-(5.4-5) and $L = e_{n_a}$, $e_{n_a}$ being the $n_a$-th 
vector of the natural basis of $\mathbb{R}^{n_a+\ell+n_b-1}$. The aim is now to construct a 
state-space representation for the initial CARMA plant (1). Note that, by (5), 

$$ \nu(t) = u(t) - c_1\nu(t-1) - \cdots - c_{n_c}\nu(t-n_c) = u(t) - [\,c_{n_c} \cdots c_1\,]\,\nu^{t-1}_{t-n_c} \tag{7.7-12} $$

and 

$$ y(t) = \gamma(t) + c_1\gamma(t-1) + \cdots + c_{n_c}\gamma(t-n_c) = [\,c_{n_c} \cdots c_1 \; 1\,]\,\gamma^{t}_{t-n_c} \tag{7.7-13} $$

if 

$$ C(d) = 1 + c_1 d + \cdots + c_{n_c}d^{n_c} $$

Extend the above state $s(t)$ as follows 

$$ s_c(t) := \Big[\,(\gamma^{t}_{t-n_\gamma})' \;\; (\nu^{t-1}_{t-n_\nu})'\,\Big]' \in \mathbb{R}^{n_\gamma+n_\nu+1} \tag{7.7-14a} $$
$$ n_\gamma := \max(n_a - 1, n_c) \qquad n_\nu := \max(\ell + n_b - 1, n_c) \tag{7.7-14b} $$

Problem 7.7-1 Verify that if the variables $\gamma$ and $\nu$ are related by (6), the following state-space 
representation holds for the vector $s_c(t)$ in (14) 

$$ s_c(t+1) = \Phi s_c(t) + G\nu(t) + Le(t+1) \qquad \gamma(t) = Hs_c(t) \tag{7.7-15} $$

where 

$$ \Phi = \begin{bmatrix} O_{n_\gamma\times 1} & I_{n_\gamma} & O_{n_\gamma\times n_\nu} \\ -a_{n_\gamma+1} \;\cdots\; -a_1 & & b_{n_\nu+1} \;\cdots\; b_2 \\ O_{(n_\nu-1)\times(n_\gamma+2)} & & I_{n_\nu-1} \\ & O_{1\times(n_\gamma+n_\nu+1)} & \end{bmatrix} $$

$$ G = b_1 e_{n_\gamma+1} + e_{n_\gamma+n_\nu+1} \qquad L = e_{n_\gamma+1} \qquad H = e'_{n_\gamma+1} $$

$a_{n_a+i} = b_{\ell+n_b+i} = 0$, $i = 1, 2, \cdots$, and $e_i$ denotes the $i$-th vector of the natural basis of 
$\mathbb{R}^{n_\gamma+n_\nu+1}$. 

Write (12) as 

$$ \nu(t) = u(t) - F_c s_c(t) \qquad F_c := [\,O_{1\times(n_\gamma+1)} \;\; c_{n_\nu} \cdots c_1\,] \tag{7.7-16a} $$

and (13) as 

$$ y(t) = H_c s_c(t) \qquad H_c := [\,c_{n_\gamma} \cdots c_1 \;\; 1 \;\; O_{1\times n_\nu}\,] \tag{7.7-16b} $$

to get 

$$ s_c(t+1) = \Phi_c s_c(t) + Gu(t) + Le(t+1) \tag{7.7-16c} $$
$$ y(t) = H_c s_c(t) \tag{7.7-16d} $$
$$ \Phi_c := \Phi - GF_c $$

This is a state-space representation for the initial plant description (1). We have 

$$ y(k+\ell) = w_{\ell+1}u(k-1) + \cdots + w_{\ell+k}u(0) + S_{\ell+k}s_c(0) + e(k+\ell) + \cdots + g_{k+\ell-1}e(1) $$

where 

$$ w_k := H_c\Phi_c^{k-1}G $$

is the $k$-th sample of the impulse response associated with $B(d)/A(d)$, 

$$ S_k := H_c\Phi_c^{k} $$

and 

$$ g_k := H_c\Phi_c^{k}L $$

Now 

$$ \hat{y}(k+\ell) := \mathcal{E}\{y(k+\ell) \mid \gamma^0, \nu^{-1}\} = w_{\ell+1}u(k-1) + \cdots + w_{\ell+k}u(0) + S_{\ell+k}s_c(0) $$

Further, since by (7) $\sigma\{\gamma^0, \nu^{-1}\} \subset \sigma\{e^0\}$, for $\tilde{y}(k) := y(k) - \hat{y}(k)$, $k \in \mathbb{Z}_+$, the 
conditional expectations 

$$ \mathcal{E}\{\tilde{y}^2(k+\ell) \mid \gamma^0, \nu^{-1}\} = \mathcal{E}\{\mathcal{E}\{\tilde{y}^2(k+\ell) \mid e^0\} \mid \gamma^0, \nu^{-1}\} $$

are by (3) a.s. constant for $k \ge 1$. Hence, the conclusion is that the optimal sequence 
$u^0_{[0,T)}$ for the stochastic problem (7)-(11) is the same as if $e(t+1) \equiv 0$ in (16c). 
In particular, provided that $n = \hat{n}$, $\hat{n}$ being the McMillan degree of $B(d)/A(d)$, the 
SIORHR solution is given by (Cf. (5.5-21)) 

$$ u^{T-1}_{0} = -M^{-1}\big[\psi_y(I_T - QLM^{-1})W_1'\Gamma_1 + Q\Gamma_2\big]s_c(0) \tag{7.7-17} $$

where $M$, $Q$, $L$, $W_1$, $\Gamma_1$ and $\Gamma_2$ are now related to the system (16). Consequently, 
the SIORHR law equals 

$$ u(t) = -e_1'M^{-1}\big[\psi_y(I_T - QLM^{-1})W_1'\Gamma_1 + Q\Gamma_2\big]s_c(t) \tag{7.7-18} $$

The next theorem states the stabilizing properties of (18). 

Theorem 7.7-1. Let the CARMA plant (1) satisfy (2)-(4). Then, provided that 
$\psi_u > 0$, the SIORHR law (18) stabilizes the CARMA plant whenever 

$$ T \ge n = \hat{n} \tag{7.7-19} $$

$\hat{n}$ being the McMillan degree of $B(d)/A(d)$. Further, for 

$$ T = n = \hat{n} \tag{7.7-20} $$

(18) yields a closed-loop system whose observable-reachable part is state-deadbeat. 

Proof First, note that, because of (2), $\Sigma := (\Phi_c, G, H_c)$ is stabilizable and detectable. Next, 
recalling the definitions of $\Gamma_1$ and $\Gamma_2$, it is seen that (18) is not affected by the unobservable states. 
As a consequence, the unobservable eigenvalues of $\Sigma$, which are stable, are left unchanged by the 
feedback action. Let $\Sigma_o$ be the observable subsystem resulting from any GK canonical 
observability decomposition of $\Sigma$. Next, let us consider any GK canonical reachability decomposition of 
$\Sigma_o$ 

$$ \Phi = \begin{bmatrix} \Phi_r & \Phi_{r\bar{r}} \\ 0 & \Phi_{\bar{r}} \end{bmatrix} \qquad G = \begin{bmatrix} G_r \\ 0 \end{bmatrix} \qquad H = [\,H_r \;\; H_{\bar{r}}\,] \tag{7.7-21} $$

with states $x = [\,x_r' \;\; x_{\bar{r}}'\,]'$ and $\dim\Phi_r = \hat{n}$, the McMillan degree of $B(d)/A(d)$. The regulation 
law (18) can be written as $u(t) = Fs_c(t) = F_r x_r(t) + F_{\bar{r}}x_{\bar{r}}(t)$. Being $\Phi_{\bar{r}}$ stable, the closed-loop 
system is stable if and only if $\Phi_r + G_rF_r$ is a stability matrix. To prove this suppose 
temporarily that $x_{\bar{r}}(0) = 0$. In such a case, for all $k \ge 0$, $x_r(k+1) = \Phi_r x_r(k) + G_r u(k)$ and 
$y(k) = H_r x_r(k)$. Thus, by virtue of Theorem 5.3-2, $\Phi_r + G_rF_r$ is a stability matrix. That, 
under (20), the observable-reachable part of the closed-loop system exhibits the state-deadbeat 
property follows by the above arguments and Theorem 5.3-2. 

Stochastic SIORHC 

We extend SIORHC to SISO CARIMA plants. SIORHC was introduced and treated 
in Sect. 5.8 within a deterministic setting. For a discussion on the motivations for 
considering CARIMA plant models the reader is referred to Sect. 5. The plant to 
be controlled is therefore represented by the following CARIMA model (Cf. (5-26)) 

$$A(d)\Delta(d)y(t) = B(d)\delta u(t) + C(d)e(t) \tag{7.7-22}$$

where $\Delta(d) := 1 - d$ and (2)-(4) hold true. We consider also a reference sequence
$r(\cdot)$ which is assumed to be known by the controller at time $t$ up to time $t + \ell + T$.
We wish to address the following 2-DOF servo problem. 

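Stochastic SIORHC Consider the CARIMA plant (22). Let (2)-(4) and
$\delta u(k) \in \sigma\{y^k, \delta u^{k-1}\}$ hold true, and the reference be known $\ell + T$ steps in
advance. Find, whenever they exist, input increments $\delta u_t^{t+T-1} \in \sigma\{y^t, \delta u^{t-1}\}$
minimizing the conditional expectation

$$\frac{1}{T} \sum_{k=t}^{t+T-1} \mathcal{E}\left\{\psi_y \varepsilon_y^2(k+\ell) + \psi_u \delta u^2(k) \mid y^t, \delta u^{t-1}\right\} \tag{7.7-23a}$$

$$\varepsilon_y(k) := y(k) - r(k) \tag{7.7-23b}$$

under the constraints

$$\delta u_{t+T}^{t+T+n-2} = O_{n-1} \qquad \mathcal{E}\left\{y_{t+T+\ell}^{t+T+\ell+n-1} \mid y^t, \delta u^{t-1}\right\} = \underline{r}(t+\ell+T) \quad \text{a.s.} \tag{7.7-24}$$

with

$$\underline{r}(k) := [\, r(k) \ \cdots \ r(k) \,]' \in \mathbb{R}^n$$

Then, the plant input increment at time $t$ given by SIORHC equals

$$\delta u(t) = \delta u(t) \tag{7.7-25}$$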

It is straightforward to find for the solution of (23) and (24)

$$\delta u_t^{t+T-1} = -M^{-1}\left\{\psi_y \left(I_T - QLM^{-1}\right) W_1' \left[\Gamma_1 s_c(t) - r_{t+\ell+1}^{t+\ell+T}\right] + Q\left[\Gamma_2 s_c(t) - \underline{r}(t+\ell+T)\right]\right\} \tag{7.7-26}$$

provided that $n \le \hat n$, $\hat n$ being here the McMillan degree of $B(d)/[A(d)\Delta(d)]$. In (26) $s_c(t)$
is the same as in (14) except for the replacement of $n_a$ by $n_a + 1$, and $\nu_{t-n_\nu}^{t-1}$ by
$\delta\nu_{t-n_\nu}^{t-1}$. Further, as in (14), all matrices are referred to the system (16).

Theorem 7.7-2. Under the same assumptions as in Theorem 1 with $A(d)$ replaced
by $A(d)\Delta(d)$ and

$$B(1) \neq 0 \tag{7.7-27}$$

the SIORHC law

$$\delta u(t) = -e_1' M^{-1}\left\{\psi_y \left(I_T - QLM^{-1}\right) W_1' \left[\Gamma_1 s_c(t) - r_{t+\ell+1}^{t+\ell+T}\right] + Q\left[\Gamma_2 s_c(t) - \underline{r}(t+\ell+T)\right]\right\} \tag{7.7-28}$$

inherits all the stabilizing properties of stochastic SIORHR, whenever

$$T \ge n = \hat n \tag{7.7-29}$$

$\hat n$ being the McMillan degree of $B(d)/[A(d)\Delta(d)]$. Further, whenever stabilizing,
SIORHC yields, thanks to its integral action, asymptotic rejection of constant dis-
turbances, and an offset-free closed-loop system.

Proof The stabilizing properties of (28) follow directly from Theorem 1. Asymptotic rejection of
constant disturbances is a consequence of the presence of the integral action in the loop. Finally,
offset-free behaviour is proved as follows. First rewrite (28) in polynomial form as
$\delta u(t) = -R_1(d)\delta\nu(t-1) - S(d)\gamma(t) + Z(d)r(t+\ell+T)$ or, after straightforward manipulations, as
$R(d)\delta u(t) = -S(d)y(t) + C(d)Z(d)r(t+\ell+T)$ with $R(d) := C(d) + dR_1(d)$. Hence, if $r(t) \equiv r$, we have
$\lim_{t\to\infty} \delta u(t) = 0$ and $y_\infty := \lim_{t\to\infty} y(t) = C(1)Z(1)S^{-1}(1)r$. That $S(1) = C(1)Z(1)$, and
hence the closed-loop system has unit dc-gain, can be shown along the same lines as in (5.8-40)-
(5.8-44) by replacing 1 by $C(d)$ in the LHS of (5.8-43).
We conclude this section by pointing out that the extension of SIORHC to
the stochastic case can be carried out by using formally the same equations as in
the deterministic case, provided that the state (5.8-32) be replaced with the C-filtered
state

$$s_c(t) := \left[\, \left(\gamma_{t-n_\gamma+1}^{t}\right)' \ \left(\delta\nu_{t-n_\nu}^{t-1}\right)' \,\right]' \tag{7.7-30}$$

$$C(d)\gamma(t) = y(t) \qquad C(d)\delta\nu(t) = \delta u(t)$$
$$n_\gamma = \max(n_a, n_c) \qquad n_\nu = \max(\ell + n_b - 1, n_c)$$

and the matrices in (28) be referred to the system (16).
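
As a hedged illustration of the C-filtering step, the following Python sketch computes a filtered sequence $\gamma$ from $C(d)\gamma(t) = y(t)$ by running the recursion $\gamma(t) = y(t) - c_1\gamma(t-1) - \cdots - c_{n_c}\gamma(t-n_c)$. It assumes that $C(d) = 1 + c_1 d + \cdots + c_{n_c} d^{n_c}$ is monic and that all samples before time 0 are zero. The function name `inverse_c_filter` and the coefficient-list convention are choices made for this example only and do not come from the text.

```python
def inverse_c_filter(c_coeffs, signal):
    """Filter `signal` by 1/C(d), with C(d) = 1 + c_1 d + ... + c_nc d^nc.

    `c_coeffs` holds [c_1, ..., c_nc] (the leading 1 is implied) and
    samples before time 0 are taken as zero (an assumption of this sketch).
    """
    filtered = []
    for t, value in enumerate(signal):
        acc = value
        for i, c_i in enumerate(c_coeffs, start=1):
            if t - i >= 0:
                acc -= c_i * filtered[t - i]
        filtered.append(acc)
    return filtered


# Example: C(d) = 1 - 0.5 d, so gamma(t) = y(t) + 0.5 * gamma(t - 1).
gamma = inverse_c_filter([-0.5], [1.0, 0.0, 0.0, 2.0])
```

The same routine applied to the input increments would give $\delta\nu(t)$; a stable recursion requires $C(d)$ to have its roots outside the unit disc, which is what the strict Hurwitz-type assumptions on $C(d)$ are meant to guarantee.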

Problem 7.7-2 (GPC for CARIMA plants) Consider GPC as in Sect. 5.8. Formulate a 2-DOF
GPC servo problem for a CARIMA plant. Find the related GPC law. Compare this result with
(5.8-52).

Problem 7.7-3 (Stochastic SIORHC information pattern) Consider the SIORHC law (28) for
the CARIMA plant (22). Show that it can be written in polynomial form as follows

$$R(d)\delta u(t) = -S(d)y(t) + C(d)v(t)$$
$$v(t) := Z(d)r(t + \ell + T)$$

Compute the maximum values of the degrees of the polynomials $R(d)$, $S(d)$ and $Z(d)$ in terms of
$\partial A(d)$, $\partial B(d)$, $\partial C(d)$ and $T$.

It is interesting to point out the strict connection which does exist between the 
LQ stochastic servo of Sect. 5 and SIORHC and GPC predictive controllers. In 
fact, whenever stabilizing, the latter approximate, as $T \to \infty$ for SIORHC and
$N_1 = 0$ and $N_u \to \infty$ for GPC, the LQ stochastic servo behaviour. This can
be concluded by comparing the performance indices that the above controllers 
minimize. The above connection makes it possible to extend to predictive control 
the considerations made at the end of Sect. 5 on the benefits that can be acquired 
by costing suitable filtered I/O variables. 

Problem 7.7-4 (Stochastic SIORHC and dynamic weights) For the CARIMA plant (22) consider
the problem of finding input increments which minimize the conditional expectation

$$\frac{1}{T}\sum_{k=t}^{t+T-1} \mathcal{E}\left\{[W_y(d)y(k+\ell) - r(k+\ell)]^2 + [W_u(d)\delta u(k)]^2 \mid y^t, \delta u^{t-1}\right\}$$

under the terminal constraints

$$W_u(d)\delta u(k) = 0 \qquad k = t+T, \cdots, t+T+n-2$$

$$\mathcal{E}\left\{W_y(d)y(k+\ell) \mid y^t, \delta u^{t-1}\right\} = r(t+\ell+T) \qquad k = t+T, \cdots, t+T+n-1$$

with $n$ a positive integer, $W_u(d) = B_u(d)/A_u(d)$ and $W_y(d) = B_y(d)/A_y(d)$, where $B_u(d)$, $A_u(d)$,
$B_y(d)$ and $A_y(d)$ are strictly Hurwitz polynomials. Show that the above problem reduces to the
standard problem (23)-(25) once $u$ and $y$ are changed into

$$\delta u_f(t) := W_u(d)\delta u(t) \qquad y_f(t) := W_y(d)y(t)$$

and the plant (22) is replaced by

$$A(d)\Delta(d)A_y(d)B_u(d)y_f(t) = B(d)A_u(d)B_y(d)\delta u_f(t) + C(d)B_y(d)B_u(d)e(t)$$

From a more practical point of view, however, there are significant differences be- 
tween predictive controllers like SIORHC and GPC and steady-state LQ stochastic 
control. In fact, while predictive controllers of receding-horizon type are amenable 
to be extended to nonlinear plants or to embody constraints on state or I/O vari- 
ables, such requirements cannot be accommodated with acceptable computational 
load within steady-state LQ stochastic control. 

Main points of the section Predictive controllers, like SIORHC and GPC,
can be extended to CARIMA plants with no formal changes in the design equa-
tions, by simply modifying the plant representation as in (16) and filtering the I/O
variables to be fed back by the inverse of the $C(d)$ innovations polynomial.



Notes and References

LQ stochastic control is a topic widely and thoroughly discussed in standard text-
books. Besides the ones referenced in Chapter 2, see also [Ast70], [AW84], [Cai88],
[DV85], [FR75], [GJ88], [GS84], [May79], [May82a], [May82b], [MG90].

The Certainty-Equivalence Principle first appeared in the economics literature 
[Sim56]. A rigorous proof of the Separation Principle for continuous-time LQG 
regulation was first given in [Won68]. 

The Minimum-Variance regulator was studied in [Ast70] and [Pet70], the lat-
ter in an adaptive setting. The Generalized Minimum-Variance adaptive regulator
was first presented in [CG75] and analysed in [CG79]. Steady-state LQ Stochas-
tic regulation for $\psi_u = 0$, or Stabilizing Minimum-Variance regulation, was first
addressed and solved by [Pet72]. See also [SK86] and [PK92]. Steady-state LQ
Stochastic Linear regulation was discussed in the monograph [Kuc79]. For an exten-
sion to more general system configurations, see [CM91], [HSK91], [HKS92]. The
material showing optimality of the Steady-state LQ Stochastic Linear regulator 
among possibly nonlinear regulators appears to be new. The monotonic properties 
of steady-state LQ Stochastic regulation were discussed in [MLMN92]. 

The approach to LQ stochastic tracking and servo discussed in Sect. 5 first
appeared in [MZ89b]. See also [Gri90] and [MG92]. For an extension of this
approach to an $H_\infty$ setting, see [MCG90]. Unlike other relevant contributions,
in [MZ89b] and [MCG90] future values of the reference realization are used in
the controller. For SISO plants a similar idea was adopted in [Sam82] though in a
state-space representation setting.

$H_\infty$ control theory was ushered in by [Zam81]. For a general overview, see the
monographs [Fra87] and [FFH+91]. For an alternative approach see [LPVD83] and
CDS!) . 

Sect. 7 on predictive control of CARMA plants improves on [CM92a]. 
CHAPTER 8



SINGLE-STEP-AHEAD 
SELF-TUNING CONTROL 



In this chapter we remove the assumption, which has been used so far, according 
to which a dynamical model of the plant is available for control design. We then 
combine recursive identification and optimal control methods to build adaptive 
control systems for unknown linear plants. Under some conditions, such systems 
behave asymptotically in an optimal way as if the control synthesis is made by 
using the true plant model. 

In Sect. 1 we briefly discuss various control approaches for uncertain plants 
and describe the two basic groups of adaptive controllers, viz. model-reference 
adaptive controllers and self-tuning controllers. Sect. 2 points out the difficulties
encountered in formulating adaptive control as an optimal stochastic control prob-
lem, and, in contrast, the possibility of adopting a simple suboptimal procedure
by enforcing the Certainty Equivalence Principle. Sect. 3 presents some analytic
tools for establishing global convergence of deterministic self-tuning control sys-
tems. In Sect. 4 we discuss the deterministic properties of the RLS identification
algorithm that typically are not subject to persistency of excitation and are, hence,
applicable in the analysis of self-tuning systems. In Sect. 5 these RLS proper-
ties are used so as to construct a self-tuning control system based on the Cheap
Control law for which global convergence can be established in a deterministic set-
ting. Sect. 6 discusses a constant-trace RLS identification algorithm with data
normalization and extends the global convergence result of Sect. 5 to a self-tuning
Cheap Control system based on such an estimator. The finite memory-length of
the latter is important for time-varying plants. Self-tuning Minimum-Variance
control is discussed in Sect. 7 where it is pointed out that implicit modelling of
CARMA plants under Minimum-Variance control can be exploited so as to con-
struct self-tuning Minimum-Variance control algorithms whose global convergence
can be proved via the stochastic Lyapunov equation method. Sect. 8 shows that
Generalized Minimum-Variance control is equivalent to Minimum-Variance control
of a modified plant, and, hence, globally convergent self-tuning algorithms based
on the former control law can be developed by exploiting the above equivalence
and the results in Sect. 7. Sect. 9 ends the chapter by describing how to robustify
self-tuning Cheap Control to counteract the presence of neglected dynamics.

We point out that all the results of this chapter pertain to single-step-ahead, or
myopic, adaptive control. For this reason, applicability of these results is severely
limited by the requirements that the plant be minimum-phase and its I/O delay
exactly known. Nonetheless, the study of these adaptive controllers is important
in that it introduces at a quite basic level ideas which, as will be shown in the next
chapter, can be effectively used to develop adaptive multistep predictive controllers
with wider application potential.

8.1 Control of Uncertain Plants 

In the remaining part of this book we shall study how to use control and identi- 
fication methods for controlling uncertain plants, viz. plants described by models 
whose structure and parameters are not all a priori known to the designer. This 
is a situation virtually always met in practice. It is then of paramount interest to 
approach this issue by using the tools introduced in the previous chapters where, 
apart from a few exceptions, we have permanently assumed that an exact plant 
representation — either deterministic or stochastic — is a priori available. 

In practice, we meet many different situations that can be referred to under 
the "uncertain plant" heading. If it is known that the plant behaves approxi- 
mately like a given nominal model, robust control methods can be used to design 
suitable feedback compensators (Cf. Sect. 3.3, 4.6, and 7.6). In other cases, the 
plant may exhibit significant variations but auxiliary variables can be measured, 
yielding information on the plant dynamics. Then, the parameters of a feedback 
compensator can be changed according to the values taken on by the auxiliary vari- 
ables. Whenever these variables provide no feedback from the actual performance 
of the closed-loop system which can compensate for an incorrect parameter set-
ting, the approach is called gain scheduling. This name can be traced back to the
early use of the method, which was aimed at compensating for changes in the plant gain.
Gain scheduling is used in flight control systems where the Mach number and the 
dynamic pressure are measured and used as scheduling variables. 
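
The open-loop nature of gain scheduling can be sketched in a few lines of Python: the controller gain is read from a pre-computed table indexed by the measured scheduling variable, with no use of the closed-loop performance. The table values, the name `MACH_SCHEDULE` and the choice of linear interpolation are hypothetical and serve only to illustrate the idea.

```python
from bisect import bisect_right


def scheduled_gain(schedule, value):
    """Linearly interpolate a controller gain from a (variable, gain) table.

    `schedule` is a list of (scheduling_value, gain) pairs sorted by
    scheduling_value; outside the table the nearest endpoint gain is held.
    """
    points = [p for p, _ in schedule]
    if value <= points[0]:
        return schedule[0][1]
    if value >= points[-1]:
        return schedule[-1][1]
    hi = bisect_right(points, value)
    (x0, g0), (x1, g1) = schedule[hi - 1], schedule[hi]
    return g0 + (g1 - g0) * (value - x0) / (x1 - x0)


# Hypothetical schedule: proportional gain versus Mach number.
MACH_SCHEDULE = [(0.2, 4.0), (0.6, 2.0), (1.0, 1.0)]
gain = scheduled_gain(MACH_SCHEDULE, 0.4)
```

Nothing in this lookup depends on the tracking error, which is why an incorrect table entry cannot be corrected by the scheme itself.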

Adaptive control mainly pertains to uncertain plants which can be modelled as
dynamic systems with some unknown constant, or slowly time-varying, parame-
ters. Adaptive controllers are traditionally grouped into the two separate classes
described hereafter. 

Model-Reference Adaptive Controllers (MRACs) In a MRAC system 
(Fig. 1) the specifications are given in terms of a reference model which indicates 
how the plant output should respond ideally to the command signal c(t). The 
overall control system can be conceived as if it consists of two loops: an inner 
loop, the ordinary control system, composed of the plant and the controller; and 
an outer loop which comprises the parameter adjustment or tuning mechanism. 
The controller parameters are adjusted by the outer loop so as to make the plant
output $y(t)$ close to the model output $y_m(t)$.

Self-Tuning Controllers (STCs) In a STC system (Fig. 2) the specifica-
tions are given in terms of a performance index, e.g. an index involving a quadratic
term in the tracking error $e(t) = y(t) - r(t)$ between the plant output and the output
reference plus an additional quadratic term in the control variable $u(t)$ or its incre-
ments $\delta u(t)$. As in a MRAC system, there are two loops: the inner loop consists
of the ordinary control system and is composed of the plant and the controller;
the outer loop consists of the parameter adjustment mechanism. The latter, in
turn, is made up of a recursive identifier, e.g. an RLS identifier (Cf. Sect. 6.3),
and a design block, e.g. a steady-state LQS tracking design block (Cf. Sect. 7.5).

Figure 8.1-1: Block diagram of a MRAC system.

Figure 8.1-2: Block diagram of a STC system.

The identifier updates an estimate of the unknown plant parameters according to 
which the controller parameters are tuned on-line by the design block. The control 
problem which is solved by the design block is the underlying control problem. If 
the identifier attempts to explicitly estimate an (open loop) plant model, e.g. a 
CARMA model, required for solving the underlying control problem, e.g. a steady- 
state LQS tracking problem, the scheme is referred to as an explicit or indirect 
STC system. In contrast with the explicit scheme, some STCs do not attempt 
to explicitly identify the plant model required for solving off-line the underlying 
control problem. On the contrary, the tuning mechanism is designed in such a 
way that self-tuning occurs thanks to identification in closed-loop of parameters 
that are relevant for solving on-line the underlying control problem. Typically, 
in such a case combined spread-in-time iterations of both the identifier and the 
design block take place to yield at convergence the controller parameters solving 
the underlying control problem. A celebrated example of such a scheme is the origi-
nal self-tuning Minimum-Variance controller of Åström and Wittenmark [AW73].
Here an RLS algorithm identifies a linear regression model relating in closed-loop
the inputs and the outputs of a CARMA plant, and a Minimum-Variance control
design (Cf. Sect. 7.3) is carried out at each time-step as if the plant coincides with
the currently estimated CAR model. These STC schemes which do not explicitly
identify the (open-loop) plant model are referred to as implicit STC systems. While
in indirect adaptive 2-DOF control systems the feedforward law is computed from
the estimated plant parameters in accordance with the underlying control law, in
some implicit adaptive schemes the feedforward law is estimated by the identifier
itself in a direct or almost direct way (Cf. 2-DOF MUSMAR in Sect. 9.4). This
possibility is accounted for in Fig. 2 where also the output reference enters the iden-
tifier. An extreme case within implicit STC systems are the direct STCs whereby
the controller parameters are directly updated via the recursive identifier. In such
a case the block labelled "Design" in Fig. 2 disappears. Direct STCs can be ob-
tained when the underlying control law is such that it allows one to reparameterize
in closed-loop the model relating the plant inputs and outputs in terms of the 
controller parameters. 

A closer comparison between Fig. 1 and Fig. 2 reveals the existence of a strong
connection between MRACs and STCs. Their basic difference, in fact, resides in
the block labelled "Model" present in Fig. 1 and absent in Fig. 2. Further, if in
Fig. 2 we let $r(t) = M(d)c(t) =: y_m(t)$, with $M(d)$ a stable transfer matrix, we see
that the STC system, provided that it makes the tracking error
$e(t) = y(t) - r(t) = y(t) - y_m(t)$ small, basically solves the same problem as in MRAC. In fact, under
the above choice, the STC system tends to make its output response $y(t)$ to the
command input $c(t)$ close to the desired model output $y_m(t)$. Therefore, it follows
that MRAC and STC systems need not differ in either their ultimate goals or
their implementation architectures. Moreover, the fact that originally MRACs have
been developed mainly as direct adaptive controllers for continuous-time plants, 
whereas the majority of STCs were introduced as schemes for discrete-time plants, 
can be considered an accidental fact. In fact, there are indirect discrete-time 
MRACs as well as continuous-time STCs. Consequently, we conclude that the 
distinction between MRACs and STCs can be properly justified on the grounds of 
their different design methodologies. 
In MRAC there are basically three design approaches: the gradient approach;
the Lyapunov function approach; and the passivity theory approach. The gradient 
approach is the original design methodology of MRAC. It consists of an adaptation 
mechanism which in the controller parameter space proceeds along the negative 
gradient of a scalar function of the error $y(t) - y_m(t)$. It was found that the
gradient approach does not always yield stable closed-loop systems. This stimu- 
lated the application of stability theory viz. Lyapunov stability theory and passivity 
theory so as to obtain guaranteed stable MRAC systems. 
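
A minimal sketch of the gradient approach, under strong simplifying assumptions that are not in the text: the plant is a static gain $y = k u$ with $k$ unknown, the reference model is $y_m = k_m c$, and the controller is the single feedforward gain $u = \theta c$. The adaptation moves $\theta$ along the negative gradient of $\tfrac12 [y - y_m]^2$, with the unknown factor $k c$ in the gradient replaced by the measurable $y_m$ (a substitution in the spirit of what is commonly called the MIT rule). All names and numerical values below are hypothetical.

```python
def mrac_gradient_step(theta, command, plant_gain, model_gain, step_size):
    """One gradient update of the feedforward gain theta.

    Plant: y = plant_gain * u, controller: u = theta * command,
    reference model: y_m = model_gain * command.
    """
    y = plant_gain * theta * command
    y_m = model_gain * command
    error = y - y_m
    # d(error)/d(theta) = plant_gain * command, which is proportional to y_m;
    # y_m is used in its place because plant_gain is unknown to the controller
    # (this assumes plant_gain and model_gain have the same sign).
    return theta - step_size * error * y_m


theta = 0.0
for _ in range(200):
    theta = mrac_gradient_step(theta, command=1.0, plant_gain=2.0,
                               model_gain=1.0, step_size=0.1)
```

With these values the recursion is $\theta \leftarrow 0.8\,\theta + 0.1$, which settles at $\theta = k_m/k = 0.5$. With `step_size=1.5` the same recursion becomes $\theta \leftarrow -2\theta + 1.5$ and diverges, a toy instance of the observation that gradient adaptation does not by itself guarantee stability.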

The design methodology of the STCs basically consists of minimizing on-line 
a quadratic performance criterion for the currently identified plant model. This 
approach is therefore more akin to LQ and predictive control theory as presented 
in the previous chapters of this book. For this reason, in the subsequent part of 
this book we shall concentrate merely on STCs. 

Simple controllers are adequate in many applications. In fact, three-term com-
pensators, e.g. discrete-time PID controllers, generating the plant input increment
$\delta u(t) := u(t) - u(t-1)$ in terms of a linear combination of the three most recent
tracking errors $e(t)$, $e(t-1)$, $e(t-2)$, $e(t) := y(t) - r(t)$, are ubiquitous in in-
dustrial applications. Such controllers are traditionally tuned by simple empirical
rules using the results of an experimental phase in which probing signals, such as
steps or pulses, are injected into the plant. This way of setting the tuning knobs
of a PID controller is called auto-tuning. Auto-tuners can be obtained [AW89] by
using rules based on transient responses, relay feedback, or relay oscillations. Auto-
tuning is generally well-accepted by industrial control engineers. In fact, typically
the auto-tuning phase is started and supervised by a human operator. The prior
knowledge on the process dynamics is then allowed to be poorer and the "safety
nets" simpler than when the controller parameters are adapted continuously as in
STC systems. Many adaptive control methods can be used so as to develop efficient
auto-tuning techniques for a wide range of industrial applications. Auto-tuning is
then a practically important application area of adaptive control techniques.
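
The three-term structure mentioned above can be sketched as follows. The first function forms the input increment as a linear combination of the three most recent tracking errors; the second maps proportional, integral and derivative gains to the three coefficients for one common discretization (backward differences with unit sampling period). That discretization, the function names and the gain values are assumptions of this example, not prescriptions from the text.

```python
def three_term_increment(gains, errors):
    """Return delta_u(t) = g0*e(t) + g1*e(t-1) + g2*e(t-2).

    `errors` is (e(t), e(t-1), e(t-2)) with e := y - r as in the text.
    """
    return sum(g * e for g, e in zip(gains, errors))


def pid_to_three_term(kp, ki, kd):
    """Map PID gains to (g0, g1, g2) for one common discretization
    (backward differences, unit sampling period; an assumption here)."""
    # Signs are flipped because e := y - r, so negative feedback acts on -e.
    return (-(kp + ki + kd), kp + 2.0 * kd, -kd)


gains = pid_to_three_term(kp=1.0, ki=0.5, kd=0.25)
delta_u = three_term_increment(gains, (1.0, 1.0, 1.0))
```

For a constant error the proportional and derivative contributions cancel and only the integral term remains, so the increment above equals $-k_i e$; this is the integral action that drives a constant tracking error to zero.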

Other design methodologies for uncertain plant control which deserve to be
mentioned are variable structure systems and universal controllers. In variable
structure systems, [Eme67], [Itk76], [Utk77], [Utk87], [Utk92], the controller forces 
the closed-loop system to evolve in a sliding mode along a sliding or switching 
surface, chosen in the state space. This can yield insensitivity to plant parameter 
variations. Drawbacks of variable structure systems are the choice of the switching 
surface, the chattering associated to the sliding modes, and the required measure- 
ment of all the plant state variables. 

Universal controllers [Nus83], [Mar85], [MM85], [WB84], have a structure which 
does not explicitly contain any parameter related to the plant. Hence, they can be
used "universally" for any unknown linear plant for which a stabilizing fixed-gain
controller of a given order is known to exist. One drawback of universal con-
trollers is that they are liable to exhibit very violent transients after the operation
is started.

We have so far intentionally avoided defining what is meant by adaptive control.
This is quite a difficult task. In fact, a meaningful and widely accepted definition,
which would make it possible to look at a given controller and decide whether it
is adaptive or not, is still lacking. As emerges from the above description of
MRACs and STCs, we adhere to the pragmatic viewpoint that adaptive control
consists of a special type of nonlinear feedback control system in which the states
can be separated into two sets corresponding to two different time scales. Fast 
time-varying states are the ones pertaining to ordinary feedback (inner loop in 
Fig. 1 and Fig. 2); slow time-varying states are regarded as parameters and consist 
of the estimated plant model parameters or controller parameters (outer loop in 
Fig. 1 and Fig. 2). This implies that linear time-invariant feedback compensators, 
e.g. constant-gain robust controllers, are not adaptive controllers. We also assume 
that in an adaptive control system some feedback action exists from the perfor- 
mance of the closed-loop system. Hence, gain scheduling is not an adaptive control 
technique, since its controller parameters are determined by a schedule without any 
feedback from the actual performance of the closed-loop system. 

Main points of the section There are several alternative approaches to the 
control problem of uncertain plants. They include robust control, gain scheduling, 
adaptive control, variable structure systems. In practice, the appropriate choice of 
a specific approach is dictated by the application at hand. Adaptive control, which 
is traditionally subdivided into MRACs and STCs, becomes appropriate whenever 
plant variations are large to such an extent as to jeopardize the stability or reduce to 
an unacceptable level the performance of the system compensated by nonadaptive 
methods. 



8.2 Bayesian and Self-Tuning Control

It would be conceptually appealing to formulate the adaptive control problem as an 
optimal stochastic control problem. We illustrate the point by discussing a simple 
example. 

Example 8.2-1 (Bayesian formulation) Consider the SISO plant

$$y(k) = \theta u(k-1) + \zeta(k) \tag{8.2-1}$$

$k \in \mathbb{Z}_1$, where $\theta$ is an unknown parameter and $\zeta$ is a white Gaussian disturbance with mean zero
and variance $\sigma^2$, $\zeta \sim N(0, \sigma^2)$. The goal is to choose $u_{[t,t+T)}$, $u(k) \in \sigma\{y^k\}$, so as to minimize
the performance index

$$C = \mathcal{E}\left\{\frac{1}{T}\sum_{k=t+1}^{t+T} [y(k) - r(k)]^2\right\} \tag{8.2-2}$$

where $r$ is a given reference. One way to proceed is to embed (1)-(2) in a stochastic control
problem. This can be done by modelling the unknown parameter $\theta$ as a Gaussian random variable
independent of $\zeta$ with prior distribution $\theta \sim N(\theta_0, \sigma_\theta^2)$, where $\theta_0$ is the nominal value of $\theta$. Under
such an assumption, we can rewrite (1) as follows

$$\begin{aligned} \theta(k+1) &= \theta(k), \qquad \theta(1) \sim N(\theta_0, \sigma_\theta^2) \\ y(k) &= u(k-1)\theta(k) + \zeta(k) \end{aligned} \tag{8.2-3}$$

This is a nonlinear dynamic stochastic system with state $\theta(k)$ and observations $y(k)$. Then,
(2)-(3) is a stochastic control problem with partial state information. For such a problem the
conditional or posterior probability density function $p(\theta(k) \mid y^{k-1})$ of $\theta(k)$ given the observed
past $(y^{k-1}, u^{k-2})$, or equivalently $y^{k-1}$, is Gaussian with conditional mean $\hat\theta(k) := \hat\theta(k \mid k-1)$
and conditional variance $\sigma_\theta^2(k) = \mathcal{E}\{\tilde\theta^2(k) \mid y^{k-1}\}$, $\tilde\theta(k) := \theta(k) - \hat\theta(k)$,

$$p(\theta(k) \mid y^{k-1}) = N\left(\hat\theta(k), \sigma_\theta^2(k)\right)$$

The last two quantities can be computed in accordance with the conditionally Gaussian Kalman
filter (Cf. Fact 6.2-1). We get

$$\hat\theta(k+1) = \hat\theta(k) + \frac{\sigma_\theta^2(k)u(k-1)}{u^2(k-1)\sigma_\theta^2(k) + \sigma^2}\left[y(k) - u(k-1)\hat\theta(k)\right] \tag{8.2-4a}$$

$$\sigma_\theta^2(k+1) = \sigma_\theta^2(k) - \frac{u^2(k-1)\sigma_\theta^4(k)}{u^2(k-1)\sigma_\theta^2(k) + \sigma^2} \tag{8.2-4b}$$

with $\hat\theta(1) = \theta_0$ and $\sigma_\theta^2(1) = \sigma_\theta^2$. Further we can write

$$y(k) = u(k-1)\hat\theta(k) + \nu(k) \tag{8.2-4c}$$

where $\nu(k) := u(k-1)\tilde\theta(k) + \zeta(k)$ conditionally on $y^{k-1}$ is Gaussian:

$$\nu(k) \sim N\left(0, u^2(k-1)\sigma_\theta^2(k) + \sigma^2\right)$$

Figure 8.2-1: Block diagram of an adaptive controller as the solution of an optimal
stochastic control problem.
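
The recursion (8.2-4a)-(8.2-4b) for the conditional mean and variance of $\theta$ is short enough to be written out directly. The sketch below is one possible transcription; the function and argument names are hypothetical, and `noise_var` stands for the disturbance variance $\sigma^2$.

```python
def hyperstate_update(theta_hat, var_theta, u_prev, y, noise_var):
    """Propagate (theta_hat(k), var_theta(k)) to time k + 1 via (8.2-4a,b).

    `u_prev` is u(k - 1) and `y` is the new observation y(k).
    """
    innovation_var = u_prev ** 2 * var_theta + noise_var
    gain = var_theta * u_prev / innovation_var
    theta_next = theta_hat + gain * (y - u_prev * theta_hat)
    var_next = var_theta - (u_prev ** 2) * var_theta ** 2 / innovation_var
    return theta_next, var_next


# One step from theta_hat = 1, variance 1, with u(k-1) = 1, y(k) = 3, sigma^2 = 1.
updated = hyperstate_update(1.0, 1.0, u_prev=1.0, y=3.0, noise_var=1.0)
```

Note that a zero input leaves both quantities unchanged: with $u(k-1) = 0$ the observation carries no information on $\theta$, which is the mechanism behind the dual effect of the control.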



In (4) the vector $\chi(k) := [\, \hat\theta(k) \ \sigma_\theta^2(k) \,]' \in \mathbb{R}^2$ can be regarded as a state, which makes it
possible to update $p(\theta(k) \mid y^{k-1})$ and express the observations as in (4c). The vector $\chi(k)$ is
called the hyperstate. The optimal stochastic control problem (2) and (4) has been reduced to a
complete state information problem. It can be solved via Stochastic Dynamic Programming using
Theorem 7.1-1. However, the system (4) is nonlinear, and no explicit closed form for the optimal
control law can be obtained, except for the $T = 1$ case. The latter is called a myopic controller,
since it is short-sighted and looks only one step ahead. For $T = 1$ we have

$$\begin{aligned} \mathcal{E}\left\{[y(t+1) - r(t+1)]^2\right\} &= \mathcal{E}\left\{\mathcal{E}\left\{[y(t+1) - r(t+1)]^2 \mid y^t\right\}\right\} \\ &= \mathcal{E}\left\{[u(t)\hat\theta(t+1) - r(t+1)]^2 + u^2(t)\sigma_\theta^2(t+1) + \sigma^2\right\} \end{aligned}$$

Minimization w.r.t. $u(t)$ yields the one-step-ahead optimal control law

$$u(t) = \frac{\hat\theta(t+1)}{\hat\theta^2(t+1) + \sigma_\theta^2(t+1)}\, r(t+1) \tag{8.2-5}$$

Note that if $\sigma_\theta^2 = 0$, i.e. we know a priori that $\theta$ equals its nominal value $\theta_0$, we have $\sigma_\theta^2(t+1) = 0$,
$\hat\theta(t+1) = \theta_0$, and

$$u(t) = \frac{r(t+1)}{\hat\theta(t+1)} = \frac{r(t+1)}{\theta_0} \tag{8.2-6}$$

The one-step-ahead controller (5) is sometimes called cautious [AW89] since, as its comparison
with (6) shows, it takes into account the parameter uncertainty.
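
The difference between the cautious law (8.2-5) and the law (8.2-6) can be made concrete with a small numerical comparison. The two functions below are a direct transcription of those formulas; their names and the sample values are hypothetical.

```python
def cautious_control(theta_hat, var_theta, reference):
    """One-step-ahead optimal input (8.2-5)."""
    return theta_hat * reference / (theta_hat ** 2 + var_theta)


def certainty_equivalent_control(theta_hat, reference):
    """Input (8.2-6), valid when theta is known exactly (theta_hat != 0)."""
    return reference / theta_hat


# Estimate 2 with posterior variance 4, reference 1.
u_cautious = cautious_control(2.0, 4.0, 1.0)
u_certain = certainty_equivalent_control(2.0, 1.0)
```

With these values the cautious input is half the certainty-equivalent one: the larger the posterior variance, the smaller the input magnitude. When the variance is zero the two laws coincide.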

Although in the multistep case $T > 1$ the optimal stochastic control problem (2) and
(4) admits no explicit solution, Example 1 leads us to make the following general
remarks. First, since the hyperstate $\chi(k)$ is an accessible state of the stochastic
nonlinear dynamic system to be controlled, $u(k)$ is expected to be a nonlinear
function of $\chi(k)$. Fig. 1 illustrates the situation. Second, by (4b), the choice of
$u(k)$ influences $\sigma_\theta^2(k+2)$, the posterior uncertainty on $\theta$ based on $y^{k+1}$. Thus, it
might be advantageous to select large values of $u^2(k)$ so as to reduce $\sigma_\theta^2(k+2)$.
On the other hand, this has to be balanced against the disadvantage of increasing
$[y(k+1) - r(k+1)]^2$. Thus, the control here has a dual effect (Cf. Sect. 7.2).
As a consequence, the control problem of Example 1 is an example of dual control
[AW89]. Since its solution is based on assigning prior distributions to the unknown
parameters and on-line computation of posterior distributions from the available
observations, the optimal stochastic control problem formulated in Example 1 is
also referred to [KV86] as a Bayesian adaptive control problem.

Apart from the above general important hints, Example 1 points out the diffi- 
culties of resorting in adaptive control to optimal stochastic control theory, except 
for the myopic control case. Since the latter has ultimately all the limitations in- 
herent to cheap control and single step regulation (Cf. Sect. 2.6 and 2.7), optimal 
stochastic control theory is of little practical use in adaptive control. This is one
of the reasons why most of the time suboptimal approaches are adopted. A very
popular approach to adaptive control is the one described in the example which 
follows. 

Example 8.2-2 (Enforced Certainty Equivalence) Consider again the plant (1) where $\zeta$ is a
zero-mean white disturbance, and $\theta$ is a nonzero constant but unknown parameter for which no
prior probability density is assigned or assumed. The aim is to choose $u(k) \in \sigma\{y^k\}$ so as to
make $y(k) \approx r(k) \equiv r$. If we knew $\theta$, according to (6) we could simply choose

$$u(t) = \frac{r}{\theta} \tag{8.2-7}$$

This, which is the Minimum-Variance (MV) control law (Cf. Theorem 7.3-2), minimizes

$$C = \mathcal{E}\left\{[y(t+1) - r]^2\right\} \tag{8.2-8}$$

for every $t$. When $\theta$ is unknown we can proceed by Enforced Certainty Equivalence: we estimate
$\theta$ on-line via LS estimation

$$\hat\theta(t) = \left[\sum_{k=0}^{t-1} u^2(k)\right]^{-1} \sum_{k=0}^{t-1} u(k)y(k+1) \tag{8.2-9}$$

and we set at each $t \in \mathbb{Z}_1$

$$u(t) = \frac{r}{\hat\theta(t)} \tag{8.2-10}$$

with $u(0) \neq 0$. In other terms, to compute the control variable we use the current estimate $\hat\theta(t)$
as if it were the true parameter $\theta$.

The controller (9)-(10) is an adaptive controller of self-tuning type for the plant
(1) and the performance index (8). Enforced Certainty Equivalence (ECE) is a sim-
ple procedure for designing adaptive controllers. ECE-based adaptive controllers
compute the control variable by solving the underlying control problem using the
current estimate $\hat\theta(t)$ of the unknown parameter vector $\theta$ as if $\hat\theta(t)$ were the true $\theta$.
If the adaptive controller achieves the same cost as the minimum which could be
achieved if $\theta$ were a priori known, we say that the adaptive system is self-optimizing.
Whenever the control law asymptotically approaches the one solving the underlying 
control problem, we say that self-tuning occurs. Further, we say that the adaptive 
controller is weakly self-optimizing, and/or that w.s. (weak sense) self-tuning oc- 
curs, if the above properties hold under the assumption that the adaptive control 
law converges. 
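
The notions of self-tuning and self-optimization can be explored numerically on the scalar plant (8.2-1) under the LS estimator (8.2-9) and the ECE law (8.2-10). The simulation below is only a sketch: it assumes Gaussian noise, uses $u(0) = 1$, and includes no safeguard against an estimate close to zero. The function name, the seed and the numerical values are hypothetical.

```python
import random


def simulate_ece(theta, reference, noise_std, steps, seed=0):
    """Simulate plant (8.2-1) under the LS estimator (8.2-9) and law (8.2-10).

    Returns the final estimate and the time-averaged squared tracking error.
    """
    rng = random.Random(seed)
    u = 1.0                      # u(0) != 0, as required in the text
    sum_uu = 0.0                 # running sum of u(k)^2
    sum_uy = 0.0                 # running sum of u(k) * y(k + 1)
    sum_sq_err = 0.0
    theta_hat = None
    for _ in range(steps):
        y_next = theta * u + rng.gauss(0.0, noise_std)   # plant (8.2-1)
        sum_sq_err += (y_next - reference) ** 2
        sum_uu += u * u
        sum_uy += u * y_next
        theta_hat = sum_uy / sum_uu                      # LS estimate (8.2-9)
        u = reference / theta_hat                        # ECE law (8.2-10)
    return theta_hat, sum_sq_err / steps


theta_hat, avg_cost = simulate_ece(theta=2.0, reference=1.0,
                                   noise_std=0.1, steps=2000)
```

In such a run one would expect the estimate to approach the true parameter (self-tuning) and the averaged squared error to approach the noise variance, here $0.01$, which is the cost attainable when $\theta$ is known (self-optimization). A single simulation is of course no substitute for a convergence proof.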

Example 8.2-3 Consider again the self-tuning controller (9)-(10), applied to the plant (1). Let
$\zeta$ be a possibly non-Gaussian zero-mean white disturbance satisfying the following martingale
difference properties

$$\mathcal{E}\{\zeta(t+1) \mid \zeta^t\} = 0 \quad \text{a.s.} \tag{8.2-11a}$$
$$\mathcal{E}\{\zeta^2(t+1) \mid \zeta^t\} = \sigma^2 \quad \text{a.s.} \tag{8.2-11b}$$

The following result is useful to study the properties of the adaptive system.
Result 8.2-1 ([LW82] Martingale local convergence). Let $\{\zeta(k), \mathcal{F}_k\}$ be a martingale dif-
ference sequence such that

$$\sup_k \mathcal{E}\{\zeta^2(k+1) \mid \mathcal{F}_k\} < \infty \quad \text{a.s.} \tag{8.2-12}$$

Let $u$ be a process adapted to $\mathcal{F}_k$, i.e. $u(k) \in \mathcal{F}_k$. Then

$$\sum_{k=0}^{\infty} u^2(k) < \infty \implies \sum_{k=0}^{\infty} u(k)\zeta(k+1) \ \text{converges} \quad \text{a.s.} \tag{8.2-13a}$$

$$\sum_{k=0}^{\infty} u^2(k) = \infty \implies \sum_{k=0}^{t-1} u(k)\zeta(k+1) = o\left(\sum_{k=0}^{t-1} u^2(k)\right) \quad \text{a.s.} \tag{8.2-13b}$$

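To apply this result, set $\mathcal{F}_k := \sigma\{\zeta^k\}$. By induction we can check that $u(k) \in \mathcal{F}_k$. Further, (11b) implies (12). Substituting (1) into (9) we get

$$\theta(t) = \theta + \Big[\sum_{k=0}^{t-1} u^2(k)\Big]^{-1} \sum_{k=0}^{t-1} u(k)\zeta(k+1) \tag{8.2-14}$$

We next show that $\sum_{k=0}^{\infty} u^2(k) = \infty$ a.s. In fact, we have a.s.

$$\begin{aligned}
\sum_{k=0}^{\infty} u^2(k) < \infty
&\;\Longrightarrow\; \sum_{k=0}^{t-1} u(k)\zeta(k+1) \ \text{converges} \\
&\;\Longrightarrow\; \Big[\sum_{k=0}^{t-1} u^2(k)\Big]^{-1} \sum_{k=0}^{t-1} u(k)\zeta(k+1) \ \text{converges} \\
&\;\Longrightarrow\; \theta(t) \ \text{converges} \\
&\;\Longrightarrow\; \lim_{t\to\infty} |u(t)| = \frac{|r|}{\lim_{t\to\infty} |\theta(t)|} > 0 \\
&\;\Longrightarrow\; \sum_{k=0}^{\infty} u^2(k) = \infty
\end{aligned}$$

Since the last line contradicts the first, $\sum_{k=0}^{\infty} u^2(k) = \infty$ a.s. It then follows from (13b) and (14) that

$$\lim_{t\to\infty} \theta(t) = \theta \quad \text{a.s.} \tag{8.2-15}$$

Therefore, in the adaptive controller (9)-(10), applied to the plant (1), self-tuning occurs.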
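In order to see if the adaptive control system is self-optimizing, we write

$$\begin{aligned}
\frac{1}{t}\sum_{k=1}^{t} [y(k) - r]^2
&= \frac{1}{t}\sum_{k=1}^{t} [\theta u(k-1) + \zeta(k) - r]^2 \\
&= \frac{1}{t}\sum_{k=1}^{t} \big\{ [\theta u(k-1) - r]^2 + \zeta^2(k) - 2r\zeta(k) + 2\theta u(k-1)\zeta(k) \big\}
\end{aligned}$$

Now, since self-tuning occurs:

• $[\theta u(k-1) - r] \to 0$;

• $\sum_{k=1}^{t} \zeta(k) = o(t)$ by (13b);

• $\sum_{k=1}^{t} u(k-1)\zeta(k) = o\big(\sum_{k=1}^{t} u^2(k-1)\big)$ and $\sum_{k=1}^{t} u^2(k-1) = O(t)$ because of (15).

We can then conclude that

$$\frac{1}{t}\sum_{k=1}^{t} [y(k) - r]^2 \;\to\; \frac{1}{t}\sum_{k=1}^{t} \zeta^2(k) \tag{8.2-16}$$

as $t \to \infty$. If we adopt the additional assumption

$$E\{\zeta^4(t+1) \mid \zeta^t\} = M < \infty \tag{8.2-17}$$

by Lemma D-2 in the Appendix D we find from (16)

$$\lim_{t\to\infty} \frac{1}{t}\sum_{k=1}^{t} [y(k) - r]^2 = \sigma^2 \quad \text{a.s.} \tag{8.2-18}$$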



On the other hand, by the discussion leading to (16) we also see that if $\theta$ is known the minimum cost C equals the R.H.S. of (18). Then, we conclude that, under (17), the adaptive system is self-optimizing for the cost C. Note that self-optimization cannot be claimed for the cost (8) for any finite t, since only the asymptotic behaviour of the adaptive system can be analysed.

Problem 8.2-1 Prove that under all the assumptions used in Example 3 the adaptive control 
system (1), (9) and (10) is self optimizing for the asymptotic MV cost 



Main points of the section A systematic optimal stochastic control approach 
based on a Bayesian reformulation of nonmyopic adaptive control problems leads to 
dual control. This is however so awkward to compute that the systematic approach 
turns out to be of little practical use. Enforced Certainty Equivalence (ECE) is a 
nonoptimal but simple procedure for designing adaptive controllers. ECE-based 
adaptive controllers exist for which, under given conditions, self-tuning and self-optimization occur.



8.3 Global Convergence Tools for Deterministic STCs

In this section we present the main analytic tools that were originally used [GRC80] to establish some desirable convergence properties of adaptive cheap controllers, viz. deterministic STCs whose underlying control problem is Cheap Control (Cf. Sect. 2.6). One reason for an in-depth study of these tools is that, as will be seen in the next chapter, they can be extended to analyse asymptotic properties of multistep predictive STCs as well. By "convergence" we mean that some of the control objectives are asymptotically achieved and all the system variables remain bounded for the given set of initial conditions. Some of the reasons for which convergence theory is important are listed hereafter:

• A convergence proof, though based on ideal assumptions, makes us more confident in the practical applicability of the algorithm;

• Convergence analysis helps in distinguishing between good and bad algorithms;

• Convergence analysis may suggest ways in which an algorithm might be improved.

For these reasons, there has been considerable research effort on the question of convergence of adaptive control algorithms. However, the nonlinearity of the adaptive control algorithms has turned out to be a major stumbling block in establishing convergence properties. In fact, taking into account that the algorithms are nonlinear and time-varying, it is quite surprising that convergence proofs can be obtained at all. Faced by the complexity of the convergence question, researchers initially concentrated on algorithms for which the control synthesis task is simple, viz. single-step-ahead STC systems. Even for this simple class of algorithms, convergence analysis turned out to be very difficult. It took the combined efforts of many researchers over about two decades to solve the convergence problem for the single-step-ahead STC systems at the end of the seventies.

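We assume that the plant to be controlled with inputs $u(k) \in \mathbb{R}$ is exactly represented as in (6.3-1). Hence, similarly to (6.3-2), for $k \in \mathbb{Z}_1$

$$y(k) = \varphi'(k-1)\theta \tag{8.3-1a}$$

$$\varphi(k-1) := \big[\, (-y^{k-1}_{k-\hat n_a})' \;\; (u^{k-1}_{k-\hat n_b})' \,\big]' \in \mathbb{R}^{\hat n_\theta} \tag{8.3-1b}$$

$$\theta := \big[\, a_1 \cdots a_{\hat n_a} \;\; b_1 \cdots b_{\hat n_b} \,\big]' \tag{8.3-1c}$$

with $\hat n_\theta := \hat n_a + \hat n_b$. Here $\hat n_a$ and $\hat n_b$ denote two known upper bounds for $n_a = \partial A^\circ(d)$ and, respectively, $n_b = \partial B^\circ(d)$, $\partial A^\circ(d)$ and $\partial B^\circ(d)$ being the degrees of the two coprime polynomials in the irreducible plant transfer function $B^\circ(d)/A^\circ(d)$, $A^\circ(0) = 1$. In general, the vector $\theta$ in (1) is not unique. In fact, $B(d)/A(d) := p(d)B^\circ(d)/[p(d)A^\circ(d)] = B^\circ(d)/A^\circ(d)$ where $p(d)$ is intended here to be any monic polynomial with $\partial p(d) \le \min(\hat n_a - n_a, \hat n_b - n_b) =: \nu$. To any such pair $(A(d), B(d))$

$$A(d) = 1 + a_1 d + \cdots + a_{n_a+\nu} d^{n_a+\nu} = p(d)A^\circ(d)$$
$$B(d) = b_1 d + \cdots + b_{n_b+\nu} d^{n_b+\nu} = p(d)B^\circ(d)$$

we can associate the parameter vector $\theta \in \mathbb{R}^{\hat n_\theta}$, $\theta \sim (A(d), B(d))$,

$$\theta = \begin{cases}
\big[\, a_1 \cdots a_{\hat n_a} \;\; b_1 \cdots b_{n_b+\nu} \;\; \underbrace{0 \cdots 0}_{\hat n_b - n_b - \nu} \,\big]' & (\nu = \hat n_a - n_a) \\[2ex]
\big[\, a_1 \cdots a_{n_a+\nu} \;\; \underbrace{0 \cdots 0}_{\hat n_a - n_a - \nu} \;\; b_1 \cdots b_{\hat n_b} \,\big]' & (\nu = \hat n_b - n_b)
\end{cases}$$

In particular, $\theta^\circ \sim (A^\circ(d), B^\circ(d))$ if

$$\theta^\circ = \big[\, a^\circ_1 \cdots a^\circ_{n_a} \;\; \underbrace{0 \cdots 0}_{\hat n_a - n_a} \;\; b^\circ_1 \cdots b^\circ_{n_b} \;\; \underbrace{0 \cdots 0}_{\hat n_b - n_b} \,\big]'$$

The set $\Theta$ of all vectors $\theta$ satisfying (1) consists of the linear variety [Lue69], or affine subspace, in $\mathbb{R}^{\hat n_\theta}$,

$$\Theta = \theta^\circ + V \tag{8.3-2}$$

where $V$ is the $\nu$-dimensional linear subspace parameterized, according to the above, by the $\nu$ free coefficients of the polynomial $p(d)$. $\Theta$ will be referred to as the parameter variety. Despite the non-uniqueness of $\theta$ in (1), standard recursive identifiers, like the Modified Projection and the RLS algorithm, enjoy the following set of properties.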
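The non-uniqueness of the parameter vector can be checked on a small example. The sketch below assumes a hypothetical first-order plant $A^\circ(d) = 1 - 0.5d$, $B^\circ(d) = d$, overparameterized with two $a$- and two $b$-coefficients, and a common factor $p(d) = 1 + cd$; the helper names are illustrative only. Every choice of $c$ should reproduce the same input/output data.

```python
import random

def poly_mul(p, q):
    """Multiply polynomials in d given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def theta_with_common_factor(c):
    """Parameter vector [a1, a2, b1, b2] for A = p*A0, B = p*B0, p(d) = 1 + c*d."""
    a0 = [1.0, -0.5]          # assumed A0(d) = 1 - 0.5 d
    b0 = [0.0, 1.0]           # assumed B0(d) = d
    a = poly_mul([1.0, c], a0)
    b = poly_mul([1.0, c], b0)
    return [a[1], a[2], b[1], b[2]]

def max_residual(theta, steps=200, seed=1):
    """Largest |y(k) - phi'(k-1) theta| along data generated by the true plant."""
    rng = random.Random(seed)
    u = [rng.uniform(-1.0, 1.0) for _ in range(steps)]
    y = [0.0]
    for k in range(1, steps):
        y.append(0.5 * y[k - 1] + u[k - 1])        # A0(d) y(k) = B0(d) u(k)
    worst = 0.0
    for k in range(2, steps):
        phi = [-y[k - 1], -y[k - 2], u[k - 1], u[k - 2]]
        pred = sum(p * th for p, th in zip(phi, theta))
        worst = max(worst, abs(y[k] - pred))
    return worst

print(max_residual(theta_with_common_factor(0.0)))
print(max_residual(theta_with_common_factor(0.7)))
```

Both residuals vanish up to rounding although the two vectors differ, so the data alone cannot single out one point of the parameter variety.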

Properties P1

• Uniform boundedness of the estimates

$$\|\theta(t)\| < M_\theta < \infty, \qquad \forall t \in \mathbb{Z}_1 \tag{8.3-3}$$

• Vanishing normalized prediction error

$$\lim_{t\to\infty} \frac{\varepsilon^2(t)}{1 + c\,\|\varphi(t-1)\|^2} = 0 \tag{8.3-4}$$

for some $c > 0$.

Here $\theta(t)$ denotes the estimate based on the observations $y^t := \{y(k)\}_{k=1}^{t}$ and regressors $\varphi^{t-1} := \{\varphi(k-1)\}_{k=1}^{t}$, and $\varepsilon(t) := y(t) - \varphi'(t-1)\theta(t-1)$ is the prediction error.

Property (3) is essential for STC implementation. In fact, boundedness of $\{\theta(t)\}$ is necessary to possibly compute the controller parameters at every t. On the other hand, (3) is not sufficient to design a controller with bounded parameters. As the next example shows, difficulties may arise from possible common divisors of $A(t,d)$ and $B(t,d)$.

Example 8.3-1 (Adaptive pole-assignment) Consider a STC system wherein the underlying control problem is pole-assignment. Let $\theta(t) \sim (A(t,d), B(t,d))$ be the plant parameter estimate at t. The corresponding pole-assignment controller should generate the next input $u(t)$ in accordance with the difference equation

$$R(t,d)u(t) = -S(t,d)\varepsilon_y(t) \tag{8.3-5}$$

Here $\varepsilon_y(t) := y(t) - r(t)$ is the tracking error, $\{r(t)\}$ being an assigned reference sequence, and $R(t,d)$ and $S(t,d)$ are polynomials solving the following Diophantine equation for a given strictly Hurwitz polynomial $\chi_{cl}(d)$

$$A(t,d)R(t,d) + B(t,d)S(t,d) = \chi_{cl}(d) \tag{8.3-6}$$

Here $\chi_{cl}(d)$ equals the desired closed-loop characteristic polynomial. Let $p(t,d)$ be the GCD of $A(t,d)$ and $B(t,d)$. Then, from Result C.1 we know that (6) is solvable if and only if $p(t,d) \mid \chi_{cl}(d)$. In practice, (6) becomes ill-conditioned whenever any root of $A(t,d)$ approaches a root of $B(t,d)$ which is far from the roots of $\chi_{cl}(d)$. In such a case, the magnitude of some coefficients of $R(t,d)$ and $S(t,d)$ becomes increasingly large.
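The ill-conditioning can be seen on a low-order instance of (6). The sketch below assumes $A = 1 + a_1 d$, $B = b_1 d + b_2 d^2$, $R = 1 + r_1 d$, $S = s_0$ and $\chi_{cl} = 1 + c_1 d + c_2 d^2$; these degrees, the function name and the numbers are illustrative assumptions. Matching powers of $d$ gives a 2x2 linear system whose determinant $b_2 - a_1 b_1$ vanishes exactly when $A$ and $B$ share a root.

```python
def pole_assignment(a1, b1, b2, c1, c2):
    """Solve (1 + a1 d)(1 + r1 d) + (b1 d + b2 d^2) s0 = 1 + c1 d + c2 d^2."""
    det = b2 - a1 * b1            # zero when A and B have a common root
    if det == 0.0:
        raise ZeroDivisionError("A and B have a common root")
    # Coefficient matching: r1 + b1*s0 = c1 - a1 and a1*r1 + b2*s0 = c2
    rhs1, rhs2 = c1 - a1, c2
    r1 = (rhs1 * b2 - b1 * rhs2) / det
    s0 = (rhs2 - a1 * rhs1) / det
    return r1, s0

a1, b1 = -0.5, 1.0                # root of A at d = 2
for gap in (1e-1, 1e-2, 1e-3):
    b2 = a1 * b1 + gap            # nonzero root of B approaches d = 2 as gap -> 0
    r1, s0 = pole_assignment(a1, b1, b2, c1=-0.2, c2=0.0)
    print(gap, r1, s0)
```

As the gap shrinks, the controller coefficients grow without bound even though the estimated plant coefficients themselves stay bounded.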

The next problem points out that in adaptive Cheap Control the underlying control design is always solvable irrespective of the GCD of $A(t,d)$ and $B(t,d)$, provided that $b_1(t) \neq 0$.

Problem 8.3-1 (Adaptive Cheap Control) Recall that given the plant parameter vector $\theta(t) \sim (A(t,d), B(t,d))$ with $b_1(t) \neq 0$, in Cheap Control the closed-loop d-characteristic polynomial equals $\chi_{cl}(t,d) = B(t,d)/[b_1(t)d]$. Check then that in such a case (6) is always solvable, and find its minimum degree solution $(R(t,d), S(t,d))$.

The above discussion shows that some provisions have to be taken in STCs different from adaptive Cheap Control in order to make the underlying control problem solvable at each finite t. Further, in order to ensure successful operation as $t \to \infty$, in some adaptive schemes the following additional identifier properties become important.

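Properties P2

• Slow asymptotic variations

$$\lim_{t\to\infty} \|\theta(t) - \theta(t-k)\| = 0, \qquad \forall k \in \mathbb{Z}_1 \tag{8.3-7}$$

• Convergence

$$\lim_{t\to\infty} \theta(t) = \theta(\infty) \in \mathbb{R}^{\hat n_\theta} \tag{8.3-8}$$

• Convergence to the parameter variety

$$\lim_{t\to\infty} \lambda_{\min}[P^{-1}(t)] = \infty \;\Longrightarrow\; \theta(\infty) \in \Theta \tag{8.3-9}$$

• Linear boundedness condition

$$\|\varphi(t-1)\| \le c_1 + c_2 \max_{k\in[1,t]} |\varepsilon(k)| \tag{8.3-10}$$

$0 \le c_1 < \infty$, $0 < c_2 < \infty$.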
It is to be pointed out that (7) does not imply that $\{\theta(t)\}$ is a Cauchy sequence and, hence, that it converges. E.g., $\theta(t) = \sin(10\ln(t+1))$ satisfies (7) but does not converge. While property (7) holds true for both the Modified Projection algorithm and RLS, the stronger properties (8) and (9) hold only for RLS. In (9) $P^{-1}(t)$ denotes the positive definite matrix given by (6.3-13g) for RLS. Another crucial property that, in addition to P1 and possibly P2, is required to establish global convergence of STC systems is (10). As will be seen, in adaptive Cheap Control (10) is satisfied if the plant is minimum-phase, while in other schemes, like adaptive SIORHC, it follows from the underlying control and the other properties in P2.
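The gap between slow asymptotic variations (7) and convergence can be illustrated numerically. The sketch below assumes that the example sequence reads $\theta(t) = \sin(10\ln(t+1))$, which is a reading of a damaged formula and should be taken as an assumption; the function names are illustrative.

```python
import math

def theta_seq(t):
    """Assumed example sequence theta(t) = sin(10 ln(t + 1))."""
    return math.sin(10.0 * math.log(t + 1.0))

def largest_step(t_start, t_end):
    """Largest one-step variation |theta(t) - theta(t-1)| on [t_start, t_end)."""
    return max(abs(theta_seq(t) - theta_seq(t - 1)) for t in range(t_start, t_end))

def spread(t_start, t_end, stride=1000):
    """Difference between the largest and smallest sampled value on the range."""
    vals = [theta_seq(t) for t in range(t_start, t_end, stride)]
    return max(vals) - min(vals)

print(largest_step(10**6, 10**6 + 1000))   # one-step variations are tiny
print(spread(10**6, 2 * 10**6))            # yet the sequence keeps oscillating
```

The one-step variation decays like $10/t$, while over any interval $[t, 2t]$ the phase advances by $10\ln 2 > 2\pi$, so the sequence sweeps a full oscillation and cannot be Cauchy.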

The use of (4) and (10) in the analysis of STC systems is based on the following lemma.

Lemma 8.3-1. [GRC80] (Key Technical Lemma) If

$$\lim_{t\to\infty} \frac{\varepsilon^2(t)}{\alpha(t) + c(t)\|\varphi(t-1)\|^2} = 0 \tag{8.3-11}$$

where $\{\alpha(t)\}$, $\{c(t)\}$ and $\{\varepsilon(t)\}$ are real-valued sequences and $\{\varphi(t-1)\}$ a vector-valued sequence; then subject to: the uniform boundedness condition

$$0 < \alpha(t) < K < \infty \quad \text{and} \quad 0 \le c(t) < K < \infty \tag{8.3-12}$$

for all $t \in \mathbb{Z}_1$, and the linear boundedness condition (10), it follows that

$$\|\varphi(t-1)\| \ \text{is uniformly bounded} \tag{8.3-13}$$

for $t \in \mathbb{Z}_1$ and

$$\lim_{t\to\infty} \varepsilon(t) = 0 \tag{8.3-14}$$

Proof If $|\varepsilon(t)|$ is uniformly bounded, from (10) it follows that $\|\varphi(t-1)\|$ is uniformly bounded as well. Then, (14) follows from (11). By contradiction, assume that $|\varepsilon(t)|$ is not uniformly bounded. Then, there exists a subsequence $\{t_n\}$ such that $\lim_{n\to\infty} |\varepsilon(t_n)| = \infty$ and $|\varepsilon(t)| \le |\varepsilon(t_n)|$ for $t \le t_n$. Now

$$\begin{aligned}
\frac{|\varepsilon(t_n)|}{[\alpha(t_n) + c(t_n)\|\varphi(t_n-1)\|^2]^{1/2}}
&\ge \frac{|\varepsilon(t_n)|}{[K + K\|\varphi(t_n-1)\|^2]^{1/2}} \\
&\ge \frac{|\varepsilon(t_n)|}{K^{1/2} + K^{1/2}\|\varphi(t_n-1)\|} \\
&\ge \frac{|\varepsilon(t_n)|}{K^{1/2} + K^{1/2}[c_1 + c_2|\varepsilon(t_n)|]} \qquad [(10)]
\end{aligned}$$

Hence

$$\lim_{n\to\infty} \frac{|\varepsilon(t_n)|}{[\alpha(t_n) + c(t_n)\|\varphi(t_n-1)\|^2]^{1/2}} \ge \frac{1}{K^{1/2}c_2} > 0$$

This contradicts (11). Hence $|\varepsilon(t)|$ must be uniformly bounded.

Lemma 1 presupposes that $|\varepsilon(t)|$ and $\|\varphi(t-1)\|$ are bounded for every finite $t \in \mathbb{Z}_1$. As long as we consider a STC system as in Fig. 1-2 where the plant and the controller are linear, and the tuning mechanism guarantees boundedness of the controller parameters at every $t \in \mathbb{Z}_1$, there is no chance of having a "finite escape time". Hence, Lemma 1 is applicable to STC systems, even if they are highly nonlinear, to show that their variables remain bounded as $t \to \infty$.

Main points of the section In a STC system it is required that the tuning mechanism provides the controller with bounded parameters. This must be ensured jointly by the identifier and the underlying control law. Once this is accomplished, no finite escape time is possible, and uniform boundedness of all the involved variables can be established by using the Key Technical Lemma along with the asymptotic properties of the identifier.



8.4 RLS Deterministic Properties 

We next derive some properties of the RLS algorithm, like P1 and P2 of Sect. 3, which are important in the analysis of STC systems. In contrast with Sect. 6.4, here we are not so much concerned about convergence to the true value of $\theta$ or to the parameter variety $\Theta$, our interest being instead mainly directed to complementary properties of RLS which hold true even when no persistency of excitation is ensured. We rewrite the RLS algorithm (6.3-13):

$$\theta(t) = \theta(t-1) + \frac{P(t-1)\varphi(t-1)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)}\,[y(t) - \varphi'(t-1)\theta(t-1)] \tag{8.4-1a}$$

$$\phantom{\theta(t)} = \theta(t-1) + P(t)\varphi(t-1)\,[y(t) - \varphi'(t-1)\theta(t-1)] \tag{8.4-1b}$$

$$P(t) = P(t-1) - \frac{P(t-1)\varphi(t-1)\varphi'(t-1)P(t-1)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)} \tag{8.4-1c}$$

$$P^{-1}(t) = P^{-1}(t-1) + \varphi(t-1)\varphi'(t-1) \tag{8.4-1d}$$

with $t \in \mathbb{Z}_1$, $\theta(0) \in \mathbb{R}^{\hat n_\theta}$ and $P(0) = P'(0) > 0$.
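The recursion (1a), (1c) can be written down directly. The following is a minimal sketch in plain Python with list-based matrices; the function names, the two-parameter test model and the random regressors are assumptions made for illustration only.

```python
import random

def rls_step(theta, P, phi, y):
    """One RLS update (1a)/(1c); returns new estimate, new P and prediction error."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    eps = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + P_phi[i] * eps / denom for i in range(n)]
    # P is symmetric, so P*phi*phi'*P is the outer product of P_phi with itself
    P_new = [[P[i][j] - P_phi[i] * P_phi[j] / denom for j in range(n)]
             for i in range(n)]
    return theta_new, P_new, eps

def run_rls(theta_true, steps=200, p0=100.0, seed=0):
    """Run RLS on noise-free data y = phi' theta_true with random regressors."""
    rng = random.Random(seed)
    n = len(theta_true)
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    err_norms = []
    for _ in range(steps):
        phi = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        y = sum(p * th for p, th in zip(phi, theta_true))
        theta, P, _ = rls_step(theta, P, phi, y)
        err_norms.append(sum((a - b) ** 2 for a, b in zip(theta, theta_true)) ** 0.5)
    return theta, err_norms

theta, errs = run_rls([1.0, -2.0])
print(theta, errs[-1])
```

With $P(0)$ a multiple of the identity the parameter error norm should never exceed its initial value, which is a convenient sanity check on any implementation.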

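Theorem 8.4-1 (Deterministic properties of RLS). Consider the RLS algorithm (1) with

$$y(t) = \varphi'(t-1)\theta, \qquad \theta \in \Theta \subset \mathbb{R}^{\hat n_\theta} \tag{8.4-2}$$

where $\Theta$ is the parameter variety (3-2). For any $\theta \in \Theta$, let $\tilde\theta(t) := \theta(t) - \theta$ and

$$\varepsilon(t) := y(t) - \varphi'(t-1)\theta(t-1) = -\varphi'(t-1)\tilde\theta(t-1) \tag{8.4-3}$$

Then, it follows that:

i. There exists

$$\lim_{t\to\infty} P(t) = P(\infty) = P'(\infty) \ge 0 \tag{8.4-4}$$

ii. As $t \to \infty$, $\theta(t)$ converges to $\theta(\infty)$

$$\theta(\infty) = \theta + P(\infty)P^{-1}(0)\tilde\theta(0) \tag{8.4-5}$$

from which for any $k \in \mathbb{Z}_1$

$$\lim_{t\to\infty} \|\theta(t) - \theta(t-k)\| = 0 \tag{8.4-6}$$

iii.

$$\|\tilde\theta(t)\|^2 \le k_1 \|\tilde\theta(0)\|^2 \tag{8.4-7}$$

$$k_1 := \frac{\lambda_{\max}[P^{-1}(0)]}{\lambda_{\min}[P^{-1}(0)]} = \frac{\lambda_{\max}[P(0)]}{\lambda_{\min}[P(0)]}$$

: the condition number of $P(0)$

iv.

$$\lim_{t\to\infty} \sum_{k=1}^{t} \frac{\varepsilon^2(k)}{1 + \varphi'(k-1)P(k-1)\varphi(k-1)} < \infty \tag{8.4-8}$$

and this implies

(a)
$$\lim_{t\to\infty} \frac{\varepsilon^2(t)}{1 + k_2\|\varphi(t-1)\|^2} = 0, \qquad k_2 = \lambda_{\max}[P(0)] \tag{8.4-9}$$

(b)
$$\lim_{t\to\infty} \sum_{k=1}^{t} \|\theta(k) - \theta(k-1)\|^2 < \infty \tag{8.4-10}$$

or more generally

(c)
$$\lim_{t\to\infty} \sum_{k=i}^{t} \|\theta(k) - \theta(k-i)\|^2 < \infty \tag{8.4-11}$$

for every positive integer i.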

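Proof

i. Since $\{P(t)\}_{t=0}^{\infty}$ is a symmetric nonnegative-definite monotonically nonincreasing matrix-sequence, (4) follows from Lemma 2.4.1.

ii. We have

$$P^{-1}(t)\tilde\theta(t) = P^{-1}(t-1)\tilde\theta(t-1) = P^{-1}(0)\tilde\theta(0) \qquad [(6.4\text{-}2f)]$$

Hence,

$$\theta(t) = \theta + P(t)P^{-1}(0)\tilde\theta(0)$$

Taking the limit for $t \to \infty$ and using (4) we get (5).

iii. We recall that in (6.4-5) for $V(t) := \tilde\theta'(t)P^{-1}(t)\tilde\theta(t)$ we found

$$V(t) = V(t-1) - \frac{\varepsilon^2(t)}{1 + \varphi'(t-1)P(t-1)\varphi(t-1)} \tag{8.4-12}$$

Thus $V(t)$ is monotonically nonincreasing and hence

$$\tilde\theta'(t)P^{-1}(t)\tilde\theta(t) \le \tilde\theta'(0)P^{-1}(0)\tilde\theta(0) \tag{8.4-13}$$

Now from (1d)

$$\lambda_{\min}[P^{-1}(t)] \ge \lambda_{\min}[P^{-1}(t-1)] \ge \lambda_{\min}[P^{-1}(0)]$$

Then

$$\begin{aligned}
\lambda_{\min}[P^{-1}(0)]\,\|\tilde\theta(t)\|^2
&\le \lambda_{\min}[P^{-1}(t)]\,\|\tilde\theta(t)\|^2 \\
&\le \tilde\theta'(t)P^{-1}(t)\tilde\theta(t) \\
&\le \tilde\theta'(0)P^{-1}(0)\tilde\theta(0) \qquad [(13)] \\
&\le \lambda_{\max}[P^{-1}(0)]\,\|\tilde\theta(0)\|^2
\end{aligned}$$

This establishes (7).

iv. Summing (12) from 1 to N, we get

$$V(N) = V(0) - \sum_{k=1}^{N} \frac{\varepsilon^2(k)}{1 + \varphi'(k-1)P(k-1)\varphi(k-1)}$$

Since $V(N)$ converges as $N \to \infty$ (Cf. Theorem 6.4-1), (8) immediately follows.

(a) Eq. (9) is a consequence of (8) since

$$\lambda_{\max}[P(t-1)] \le \lambda_{\max}[P(t-2)] \le \cdots \le \lambda_{\max}[P(0)].$$

(b) Eq. (10) is a consequence of (8) since

$$\begin{aligned}
\|\theta(k) - \theta(k-1)\|^2
&= \frac{\varphi'(k-1)P^2(k-1)\varphi(k-1)}{[1 + \varphi'(k-1)P(k-1)\varphi(k-1)]^2}\,\varepsilon^2(k) \qquad [(1)] \\
&\le \lambda_{\max}[P(k-1)]\,\frac{\varphi'(k-1)P(k-1)\varphi(k-1)}{[1 + \varphi'(k-1)P(k-1)\varphi(k-1)]^2}\,\varepsilon^2(k) \\
&\le \lambda_{\max}[P(0)]\,\frac{\varepsilon^2(k)}{1 + \varphi'(k-1)P(k-1)\varphi(k-1)}
\end{aligned}$$

(c) We have

$$\|\theta(k) - \theta(k-i)\|^2 = \Big\|\sum_{r=k-i+1}^{k} [\theta(r) - \theta(r-1)]\Big\|^2 \le i \sum_{r=k-i+1}^{k} \|\theta(r) - \theta(r-1)\|^2$$

In fact, setting $v(r) := \theta(r) - \theta(r-1)$, we find

$$\begin{aligned}
\Big\|\sum_{r} v(r)\Big\|^2
&= \sum_{r} \|v(r)\|^2 + \sum_{r \neq s} v'(r)v(s) \\
&\le \sum_{r} \|v(r)\|^2 + \frac{1}{2}\sum_{r \neq s} \big[\|v(r)\|^2 + \|v(s)\|^2\big] \\
&= i \sum_{r} \|v(r)\|^2
\end{aligned}$$

where the first inequality follows by Schwarz inequality.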

For a similar analysis of the Modified Projection algorithm (6.3-11), the reader is 
referred to [GS84]. 

Problem 8.4-1 (Uniqueness of the asymptotic estimate) Prove that $\theta(\infty)$ given by (5) is not affected by $v$, where $v = \theta - \theta^\circ \in V$, $V$ being the subspace of $\mathbb{R}^{\hat n_\theta}$ in (3-2). [Hint: Show that $v \in V$ implies $\varphi'(t-1)v = 0$ and hence $P(0)P^{-1}(t)v = v$.]

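Problem 8.4-2 Show that in the Modified Projection algorithm (6.3-11) we have

$$\|\tilde\theta(t)\| \le \|\tilde\theta(t-1)\| \le \|\tilde\theta(0)\|$$

and

$$\lim_{t\to\infty} \sum_{k=1}^{t} \frac{\varepsilon^2(k)}{c + \|\varphi(k-1)\|^2} < \infty$$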

Main points of the section In a deterministic setting the RLS algorithm fulfills Properties 1 and 2 of Sect. 3 required for establishing global convergence of STC systems. These results hold true in the absence of any persistency of excitation condition.

8.5 Self-Tuning Cheap Control

We shall consider a SISO plant of the form 

$$A^\circ(d)y(k) = B^\circ(d)u(k) + c \tag{8.5-1}$$

with I/O delay $\tau$

$$\tau := \mathrm{ord}\,B^\circ(d) \ge 1$$

$A^\circ(d)$ and $B^\circ(d)$ coprime, $\partial A^\circ(d) = n_a$, $\partial B^\circ(d) = \tau + n_b - 1$, and $c$ a constant disturbance. By setting

$$A(d) := \Delta(d)A^\circ(d) \quad \text{and} \quad B(d) := \Delta(d)B^\circ(d)$$

with $\Delta(d) := 1 - d$, (1) can be rewritten as follows

$$A(d)y(k) = B(d)u(k) = d^\tau B_0(d)u(k) \tag{8.5-2}$$

where, similarly to (7.3-6), $d^\tau B_0(d) = B(d)$, with $\partial B_0(d) = n_b$. Let $(Q_\tau(d), G_\tau(d))$ be the minimum-degree solution w.r.t. $Q_\tau(d)$ of the Diophantine equation

$$1 = A(d)Q_\tau(d) + d^\tau G_\tau(d), \qquad \partial Q_\tau(d) \le \tau - 1 \tag{8.5-3}$$

Then, we have

$$y(t+\tau) = G_\tau(d)y(t) + Q_\tau(d)B_0(d)u(t) = \alpha(d)y(t) + \beta(d)u(t) \tag{8.5-4a}$$

where, provided that $\ell$, $\ell := \tau - 1$, denotes the I/O transport delay

$$\alpha(d) := G_\tau(d) = \alpha_0 + \alpha_1 d + \cdots + \alpha_{n_a} d^{n_a} \tag{8.5-4b}$$

$$\beta(d) := Q_\tau(d)B_0(d) = \beta_0 + \beta_1 d + \cdots + \beta_{n_b+\ell}\, d^{n_b+\ell}, \qquad \beta_0 \neq 0 \tag{8.5-4c}$$

Let $\hat n_a \ge n_a$ and $\hat n_b \ge n_b$. Then we can write

$$y(t+\tau) = \varphi'(t)\theta \tag{8.5-5a}$$

$$\varphi(t) := \big[\, (y^{t}_{t-\hat n_a})' \;\; (u^{t}_{t-\hat n_b-\ell})' \,\big]' \in \mathbb{R}^{\hat n_\theta} \tag{8.5-5b}$$

with $\theta \in \Theta$, the parameter variety as in (3-2), $\Theta = \theta^\circ + V$,

$$\theta^\circ = \big[\, \alpha_0 \cdots \alpha_{n_a} \;\; \underbrace{0 \cdots 0}_{\hat n_a - n_a} \;\; \beta_0 \cdots \beta_{n_b+\ell} \;\; \underbrace{0 \cdots 0}_{\hat n_b - n_b} \,\big]'$$

Define the output tracking error

$$\varepsilon_y(t+\tau) := y(t+\tau) - r(t+\tau) = \varphi'(t)\theta - r(t+\tau) \tag{8.5-6}$$

where $r$ is the output reference. If we know $\theta$, Cheap Control chooses $u(t)$, $t \in \mathbb{Z}$, so as to satisfy

$$\varphi'(t)\theta = r(t+\tau) \tag{8.5-7}$$

Note that for every $\theta \in \Theta$ the $(\hat n_a + 1)$th component equals $\beta_0 \neq 0$. This makes it possible to solve (7) w.r.t. $u(t)$. The control law given implicitly by (7) makes the tracking error (6) identically zero, and the closed-loop system internally stable (Cf. Sect. 3.2) if the plant (1) is minimum-phase.

Hereafter, we assume that $\theta$ is unknown. We then proceed to design a self-tuning cheap controller (STCC), of implicit type, viz. an Enforced Certainty Equivalence adaptive controller whereby $\theta$ is replaced by its RLS estimate $\theta(t)$ and the underlying control law is Cheap Control. In this way, if $\tau > 1$ we do not estimate the explicit plant model (1) or (2) but instead the $\tau$-step ahead output prediction model (4) which is directly related to the underlying control law (7). For this reason, the adjective "implicit" is associated to the STCC defined below.



Implicit STCC 

The parameter vector $\theta$ is estimated via the following RLS algorithm, $t \in \mathbb{Z}_1$,

$$\theta(t) = \theta(t-1) + \frac{a(t)P(t-\tau)\varphi(t-\tau)}{1 + a(t)\varphi'(t-\tau)P(t-\tau)\varphi(t-\tau)}\,\varepsilon(t) \tag{8.5-8a}$$

$$\varepsilon(t) = y(t) - \varphi'(t-\tau)\theta(t-1) \tag{8.5-8b}$$

$$P(t-\tau+1) = P(t-\tau) - \frac{a(t)P(t-\tau)\varphi(t-\tau)\varphi'(t-\tau)P(t-\tau)}{1 + a(t)\varphi'(t-\tau)P(t-\tau)\varphi(t-\tau)} \tag{8.5-8c}$$

or

$$P^{-1}(t-\tau+1) = P^{-1}(t-\tau) + a(t)\varphi(t-\tau)\varphi'(t-\tau) \tag{8.5-8d}$$

In (8) $a(t)$ is a positive real number chosen according to the following rule

$$a(t) = \begin{cases}
1 & \text{if [the } (\hat n_a + 1)\text{th component of the R.H.S. of (8a) evaluated using } a(t) = 1] \neq 0 \\
a & \text{otherwise, } a \neq 1 \text{ a fixed positive real}
\end{cases} \tag{8.5-8e}$$

Such a rule guarantees that the $(\hat n_a + 1)$th component $\theta_{\hat n_a+1}(t)$ of $\theta(t)$ is nonzero, as required by the adopted control law

$$\varphi'(t)\theta(t) = r(t+\tau) \tag{8.5-9}$$

The STCC algorithm is initialized from any $\theta(0)$ with $\theta_{\hat n_a+1}(0) \neq 0$, and any $P(1-\tau) = P'(1-\tau) > 0$.
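The implicit STCC can be exercised on the simplest possible case. The sketch below is only an illustration under strong simplifying assumptions: unit delay ($\tau = 1$), no constant disturbance, and a hypothetical first-order predictive model $y(t+1) = \alpha_0 y(t) + \beta_0 u(t)$ with regressor $[y(t), u(t)]'$. The function name, the square-wave reference, the numerical tolerance used in the rule for $a(t)$ and all numbers are assumptions.

```python
def simulate_stcc(steps=200, alpha0=1.2, beta0=0.5, a_alt=0.5):
    """Implicit STCC for the assumed model y(t+1) = alpha0*y(t) + beta0*u(t)."""
    theta = [0.0, 1.0]                      # theta(0), input coefficient nonzero
    P = [[100.0, 0.0], [0.0, 100.0]]        # P(0) = P'(0) > 0
    y = 0.0
    errors = []
    for t in range(steps):
        r_next = 1.0 if (t // 20) % 2 == 0 else -1.0   # bounded reference
        u = (r_next - theta[0] * y) / theta[1]         # control law (9)
        phi = [y, u]
        y = alpha0 * y + beta0 * u                     # plant response y(t+1)
        errors.append(y - r_next)                      # tracking error
        eps = y - (phi[0] * theta[0] + phi[1] * theta[1])   # prediction error (8b)
        P_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                 P[1][0] * phi[0] + P[1][1] * phi[1]]
        quad = phi[0] * P_phi[0] + phi[1] * P_phi[1]
        a = 1.0
        # rule (8e): change a(t) if the input coefficient would (numerically) vanish
        if abs(theta[1] + P_phi[1] * eps / (1.0 + quad)) < 1e-9:
            a = a_alt
        gain = a / (1.0 + a * quad)
        theta = [theta[i] + gain * P_phi[i] * eps for i in range(2)]      # (8a)
        P = [[P[i][j] - gain * P_phi[i] * P_phi[j] for j in range(2)]
             for i in range(2)]                                          # (8c)
    return theta, errors

theta, errors = simulate_stcc()
print(theta, max(abs(e) for e in errors[-50:]))
```

The plant in this sketch is open-loop unstable, yet the tracking error should die out after a short transient, without the estimate having to coincide exactly with the true parameters.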

Problem 8.5-1 Consider the RLS algorithm (8) embodying the constraint that $\theta_{\hat n_a+1}(t) \neq 0$. Let $\tilde\theta(t) := \theta(t) - \theta$ and $V(t) := \tilde\theta'(t)P^{-1}(t-\tau+1)\tilde\theta(t)$. Show that

$$V(t) = V(t-1) - \frac{a(t)\varepsilon^2(t)}{1 + a(t)\varphi'(t-\tau)P(t-\tau)\varphi(t-\tau)}$$

(Cf. (4-12)). Use this result to show that the algorithm still enjoys the same properties as in Theorem 4-1.

We now turn to the analysis of the implicit STCC (8), (9), by the tools discussed in Sect. 3 under the following assumptions.






Assumption 8.5-1

• The plant I/O delay $\tau$ is known.

• Upper bounds $\hat n_a$ and $\hat n_b$ for the degrees of the polynomials in (4) are known.

• The plant (1) is minimum-phase, i.e. $B^\circ(d)/d^\tau$ is strictly Hurwitz.

• The reference sequence $\{r(t)\}_{t=1}^{\infty}$ is bounded:

$$|r(t)| \le R < \infty$$

□

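We first point out that, by the first two items of Assumption 1, $\theta(t)$ and the controller parameters are bounded. Therefore $u(t)$ and $y(t)$ are bounded for every finite t (no finite escape time). Further, (4-9) holds true for the estimation algorithm (8) (Cf. Problem 1)

$$\lim_{t\to\infty} \frac{\varepsilon^2(t)}{1 + k_2\|\varphi(t-\tau)\|^2} = 0 \tag{8.5-10}$$

Moreover,

$$\lim_{t\to\infty} \|\theta(t) - \theta(t-k)\| = 0, \qquad \forall k \in \mathbb{Z}_1 \tag{8.5-11}$$

Look now at the tracking error

$$\begin{aligned}
\varepsilon_y(t) &= y(t) - r(t) \\
&= \varphi'(t-\tau)\theta - \varphi'(t-\tau)\theta(t-\tau) \qquad [(5)\,\&\,(9)] \\
&= -\varphi'(t-\tau)\tilde\theta(t-\tau)
\end{aligned} \tag{8.5-12}$$

Thus,

$$\frac{-\varepsilon_y(t)}{[1 + k_2\|\varphi(t-\tau)\|^2]^{1/2}} = \frac{\varphi'(t-\tau)\tilde\theta(t-1) - \varphi'(t-\tau)[\theta(t-1) - \theta(t-\tau)]}{[1 + k_2\|\varphi(t-\tau)\|^2]^{1/2}}$$

Since $\varphi'(t-\tau)\tilde\theta(t-1) = -\varepsilon(t)$, the limit of the R.H.S. is clearly zero from (10) and (11). Hence

$$\lim_{t\to\infty} \frac{\varepsilon_y^2(t)}{1 + k_2\|\varphi(t-\tau)\|^2} = 0 \tag{8.5-13}$$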

The aim is now to apply the Key Technical Lemma 3-1 with $\varepsilon(t)$ changed into $\varepsilon_y(t)$. To this end, we need to establish the linear boundedness condition

$$\|\varphi(t-\tau)\| \le c_1 + c_2 \max_{k\in[1,t]} |\varepsilon_y(k)| \tag{8.5-14}$$

for some nonnegative reals $c_1$ and $c_2$. To prove (14), we rewrite (1) as follows

$$\frac{B^\circ(d)}{d^\tau}\,u(k-\tau) = A^\circ(d)y(k) - c$$

We see that $u(k-\tau)$ can be seen as the output and $y(k)$ and $c$ as inputs of a time-invariant linear dynamic system which by the third item of Assumption 1 is asymptotically stable. Then, there exist [GS84] nonnegative reals $m_1$ and $m_2$ such that for all $k \in [1,t]$

$$|u(k-\tau)| \le m_1 + m_2 \max_{i\in[1,t]} |y(i)| \tag{8.5-15}$$

with $m_1$ depending upon the initial condition $\{y(0), \cdots, y(-n_a+1), u(-\tau), \cdots, u(-n_b-\tau+1)\}$. Therefore, by (5b),

$$\|\varphi(t-\tau)\| \le \hat n_\theta \Big\{ m_3 + \max(1, m_2) \max_{i\in[1,t]} |y(i)| \Big\}$$

On the other hand, by boundedness of the reference,

$$|\varepsilon_y(t)| \ge |y(t)| - |r(t)| \ge |y(t)| - R$$

Hence

$$\|\varphi(t-\tau)\| \le \hat n_\theta \Big\{ m_3 + \max(1, m_2) \Big[ \max_{k\in[1,t]} |\varepsilon_y(k)| + R \Big] \Big\} = c_1 + c_2 \max_{k\in[1,t]} |\varepsilon_y(k)|$$

for nonnegative reals $c_1$ and $c_2$. The above discussion and Problem 2 below establish global convergence of the implicit STCC.

Theorem 8.5-1. (Global convergence of the implicit STCC) Provided that Assumption 1 is satisfied, the implicit STCC algorithm (8), (9), when applied to the plant (1), for any possible initial condition yields:

i. $\{y(t)\}$ and $\{u(t)\}$ are bounded sequences; (8.5-16)

ii. $\lim_{t\to\infty} [y(t) - r(t)] = 0$; (8.5-17)

iii. $\lim_{t\to\infty} \sum_{k=\tau}^{t} [y(k) - r(k)]^2 < \infty$. (8.5-18)

Problem 8.5-2 Prove (18) by using Theorem 4-1, part iv

$$\lim_{t\to\infty} \sum_{k=\tau}^{t} \frac{\varepsilon^2(k)}{1 + \varphi'(k-\tau)P(k-\tau)\varphi(k-\tau)} < \infty \tag{8.5-19}$$

$$\lim_{t\to\infty} \sum_{k=i}^{t} \|\theta(k) - \theta(k-i)\|^2 < \infty \tag{8.5-20}$$

the relationship between $\varepsilon(k)$ and $\varepsilon_y(k)$

$$\varepsilon_y(k) = \varepsilon(k) + \varphi'(k-\tau)[\theta(k-1) - \theta(k-\tau)] \tag{8.5-21}$$

Schwarz inequality, and boundedness of $\{\varphi(k-\tau)\}$ as implied by (16).

The theorem above is important in that it establishes that, irrespective of the initial conditions, for the implicit STCC system:

• closed-loop stability is achieved;

• the output tracking error asymptotically vanishes;

• the convergence rate for the square of the output tracking error is faster than $1/t$.

It is to be pointed out that such conclusions are obtained without assuming convergence of the estimated parameter vector $\theta(t)$ to the parameter variety $\Theta$. In fact, no claim that self-tuning occurs can be advanced. On the other hand, the minimum-phase assumption is quite restrictive and, as we know, an unavoidable consequence of the adopted underlying control. We can try to remove such a restriction by using a long-range predictive control law. In the next chapter we shall see that this is surprisingly complicated if global convergence of the resulting adaptive system has to be guaranteed.



Direct STCC 



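We assume hereafter that the reference to be tracked is a constant set-point. Then, $r(k+1) = r(k) = r$, $k \in \mathbb{Z}_1$. Hence, denoting by $\varepsilon_y(k) := y(k) - r$ the output tracking error, similarly to (2) we can write

$$A(d)\varepsilon_y(k) = B(d)u(k) = d^\tau B_0(d)u(k) \tag{8.5-22}$$

Hence, similarly to (3) and (4), we have the predictive model

$$\varepsilon_y(k+\tau) = \alpha(d)\varepsilon_y(k) + \beta(d)u(k) \tag{8.5-23}$$

Here Cheap Control consists of choosing $u(k)$ so as to make the L.H.S. of the above equation equal to zero:

$$u(t) = -\frac{\alpha(d)}{\beta_0}\,\varepsilon_y(t) - \frac{\beta(d) - \beta_0}{\beta_0}\,u(t) = -f's(t) \tag{8.5-24}$$

$$f := \Big[\, \frac{\alpha_0}{\beta_0} \cdots \frac{\alpha_{n_a}}{\beta_0} \;\; \frac{\beta_1}{\beta_0} \cdots \frac{\beta_{n_b+\ell}}{\beta_0} \,\Big]' \tag{8.5-25}$$

$$s(t) := \big[\, \varepsilon_y(t) \cdots \varepsilon_y(t-n_a) \;\; u(t-1) \cdots u(t-n_b-\ell) \,\big]' \tag{8.5-26}$$

We then see that (23) can be reparameterized in terms of the Cheap Control feedback vector $f$

$$\frac{\varepsilon_y(k+\tau)}{\beta_0} - u(k) = f's(k) \tag{8.5-27}$$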

Then, when $\beta_0$ is known, a direct STC algorithm can be obtained as follows. From the observations

$$z(t) := \frac{\varepsilon_y(t)}{\beta_0} - u(t-\tau) = s'(t-\tau)f, \qquad t \in \mathbb{Z}_1 \tag{8.5-28}$$

recursively estimate the Cheap Control feedback vector $f$. Let $f(t)$ be the estimate of $f$ based on $z^t$. E.g., such an estimate can be obtained by the Modified Projection or the RLS algorithm. Specifically, in the latter case

$$f(t) = f(t-1) + \frac{P(t-\tau)s(t-\tau)}{1 + s'(t-\tau)P(t-\tau)s(t-\tau)}\,[z(t) - s'(t-\tau)f(t-1)] \tag{8.5-29}$$

$$P(t-\tau+1) = P(t-\tau) - \frac{P(t-\tau)s(t-\tau)s'(t-\tau)P(t-\tau)}{1 + s'(t-\tau)P(t-\tau)s(t-\tau)} \tag{8.5-30}$$

with $f(0)$ and $P(1-\tau) = P'(1-\tau) > 0$ arbitrary. For the next input $u(t)$ set

$$u(t) = -f'(t)s(t) \tag{8.5-31}$$

We can still try to use the above direct STCC if, though $\beta_0$ is not exactly known, a nominal value $\hat\beta_0$ of $\beta_0$, $\hat\beta_0 \approx \beta_0$, is available. In such a case, to estimate $f$ in (29) we use, instead of $z(t)$, the observations

$$\hat z(t) := \frac{\varepsilon_y(t)}{\hat\beta_0} - u(t-\tau), \qquad t \in \mathbb{Z}_1 \tag{8.5-32}$$

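How close to $\beta_0$ should $\hat\beta_0$ be in order to possibly make the direct STCC work? To find an answer we resort to the following simple argument. Multiply each term of (29) by $\rho(t-\tau) := s'(t-\tau)P(t-\tau)s(t-\tau)$ to obtain

$$\rho(t-\tau)s'(t-\tau)f(t) + s'(t-\tau)[f(t) - f(t-1)] = \rho(t-\tau)\hat z(t)$$

Assuming that $f(t) \cong f(t-1)$, we get

$$\begin{aligned}
s'(t-\tau)f(t) &\cong \frac{1}{\hat\beta_0}\,\varepsilon_y(t) - u(t-\tau) \qquad [(32)] \\
&= \frac{\beta_0 - \hat\beta_0}{\hat\beta_0}\,u(t-\tau) + \frac{\beta_0}{\hat\beta_0}\,s'(t-\tau)f \qquad [(27)] \\
&= s'(t-\tau)\Big[ -\frac{\beta_0 - \hat\beta_0}{\hat\beta_0}\,f(t-\tau) + \frac{\beta_0}{\hat\beta_0}\,f \Big] \qquad [(31)]
\end{aligned}$$

This equation is valid in closed-loop and yields an updating equation for $f(t)$ with each term premultiplied by $s'(t-\tau)$. We can conjecture from such an equation that the evolution of $f(t)$ is stable provided that $|(\beta_0 - \hat\beta_0)/\hat\beta_0| < 1$, or equivalently

$$0 < \frac{\beta_0}{\hat\beta_0} < 2 \tag{8.5-33}$$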

Therefore, (33) looks like a stability condition that must be guaranteed to the direct STCC. This condition was in fact pointed out in [ABLW77], and later in [GS84], to be required for global convergence of the direct STCC algorithm under the additional assumption that the plant is minimum-phase and its I/O delay $\tau$ is known. Note that (33) is equivalent to

$$\begin{cases}
\dfrac{\beta_0}{2} < \hat\beta_0 < \infty & (\beta_0 > 0) \\[2ex]
-\infty < \hat\beta_0 < \dfrac{\beta_0}{2} & (\beta_0 < 0)
\end{cases} \tag{8.5-34}$$

In particular, it requires that $\beta_0$ and $\hat\beta_0$ have the same sign. Simulation experience [GS84] suggests, however, that a practical range for $\beta_0/\hat\beta_0$ be (0.5, 1.5).
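The heuristic behind (33) can be mimicked by a scalar recursion. The sketch below assumes unit delay and treats the feedback vector as a scalar, iterating $f(t) = -g f(t-1) + (1+g) f$ with $g := (\beta_0 - \hat\beta_0)/\hat\beta_0$; this reduction and the function name are illustrative assumptions, not the actual STCC dynamics.

```python
def iterate_feedback_estimate(ratio, f_true=1.0, f_init=0.0, steps=60):
    """Iterate f(t) = -g*f(t-1) + (1 + g)*f_true, with g = ratio - 1.

    `ratio` stands for beta_0 divided by its nominal value.
    """
    g = ratio - 1.0
    f = f_init
    for _ in range(steps):
        f = -g * f + (1.0 + g) * f_true
    return f

for ratio in (0.5, 1.5, 2.5):
    print(ratio, iterate_feedback_estimate(ratio))
```

The error obeys $e(t) = -g\,e(t-1)$, so it decays exactly when $|g| < 1$, i.e. when the ratio lies in $(0, 2)$, and it grows geometrically otherwise.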



Main points of the section An implicit STCC system with a global convergence property can be constructed, even if no claim that self-tuning occurs for its feedback gain can be advanced. It is restricted to minimum-phase plants with known I/O delay. The direct STCC system (29)-(31) has the additional restriction that the sign of the first nonzero coefficient of the plant $B(d)$-polynomial must be known together with its approximate size.



8.6 Constant Trace Normalized RLS and STCC 



An important reason for using adaptive control in practice is to achieve good performance with time-varying plants. In this case the use of an identifier with a finite data memory is required; Cf. (6.3-40)-(6.3-43). To this end, we first discuss in detail the properties of a constant trace RLS algorithm. We next show that a STCC system equipped with such a finite data memory identifier still enjoys the compatibility property of being globally convergent when applied to time-invariant plants.

Constant Trace Normalized RLS (CT-NRLS) 

With reference to the data generating mechanism (4-2), we define the normalization factor

$$m(t-1) := \max\{m, \|\varphi(t-1)\|\}, \qquad m > 0, \tag{8.6-1a}$$

and the normalized data

$$\gamma(t) := \frac{y(t)}{m(t-1)}, \qquad x(t-1) := \frac{\varphi(t-1)}{m(t-1)} \tag{8.6-1b}$$

The estimate $\theta(t)$ based on $\gamma^t$ and $x^{t-1}$, $t \in \mathbb{Z}_1$, is given by (Cf. (6.3-22), (6.3-43))

$$\theta(t) = \theta(t-1) + \frac{P(t-1)x(t-1)}{1 + x'(t-1)P(t-1)x(t-1)}\,\varepsilon(t) = \theta(t-1) + \lambda(t)P(t)x(t-1)\varepsilon(t) \tag{8.6-1c}$$

$$\varepsilon(t) = \gamma(t) - x'(t-1)\theta(t-1) \tag{8.6-1d}$$

$$P(t) = \frac{1}{\lambda(t)}\Big[ P(t-1) - \frac{P(t-1)x(t-1)x'(t-1)P(t-1)}{1 + x'(t-1)P(t-1)x(t-1)} \Big] \tag{8.6-1e}$$

$$P^{-1}(t) = \lambda(t)\big[ P^{-1}(t-1) + x(t-1)x'(t-1) \big] \tag{8.6-1f}$$

$$\lambda(t) = 1 - \frac{1}{\mathrm{Tr}[P(0)]}\,\frac{x'(t-1)P^2(t-1)x(t-1)}{1 + x'(t-1)P(t-1)x(t-1)} \tag{8.6-1g}$$

with $\theta(0) \in \mathbb{R}^{\hat n_\theta}$ arbitrary and $P(0) = P'(0) > 0$.

As discussed in connection with (6.3-43), the above algorithm has the constant trace property $\mathrm{Tr}[P(t)] = \mathrm{Tr}[P(0)]$. Further, data normalization is adopted so as to ensure that $\lambda(t)$ be lowerbounded away from zero.
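The constant trace property and the lower bound on the forgetting factor can be checked numerically. The following is a minimal sketch of one CT-NRLS step in plain Python; the function names, the two-parameter data model and the numbers are assumptions. As a deliberate deviation, the code uses the current trace of $P$ in place of $\mathrm{Tr}[P(0)]$ in (1g); the two coincide in exact arithmetic.

```python
import random

def ct_nrls_step(theta, P, phi, y, m=1.0):
    """One CT-NRLS update (1a)-(1g); returns new estimate, new P and lambda(t)."""
    n = len(theta)
    norm_phi = sum(v * v for v in phi) ** 0.5
    scale = max(m, norm_phi)                      # normalization factor (1a)
    x = [v / scale for v in phi]                  # normalized regressor (1b)
    gamma = y / scale                             # normalized output (1b)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    trace = sum(P[i][i] for i in range(n))        # equals Tr[P(0)] in exact arithmetic
    lam = 1.0 - sum(v * v for v in Px) / (trace * denom)   # (1g), x'P^2x = |Px|^2
    eps = gamma - sum(x[i] * theta[i] for i in range(n))   # (1d)
    theta_new = [theta[i] + Px[i] * eps / denom for i in range(n)]        # (1c)
    P_new = [[(P[i][j] - Px[i] * Px[j] / denom) / lam for j in range(n)]
             for i in range(n)]                                            # (1e)
    return theta_new, P_new, lam

def run_ct_nrls(steps=300, seed=0):
    """Run CT-NRLS on noise-free data from an assumed two-parameter model."""
    rng = random.Random(seed)
    theta_true = [1.0, -2.0]
    theta = [0.0, 0.0]
    P = [[10.0, 0.0], [0.0, 10.0]]
    lams = []
    for _ in range(steps):
        phi = [3.0 * rng.uniform(-1.0, 1.0) for _ in range(2)]
        y = sum(p * th for p, th in zip(phi, theta_true))
        theta, P, lam = ct_nrls_step(theta, P, phi, y)
        lams.append(lam)
    return theta, P, lams

theta, P, lams = run_ct_nrls()
print(P[0][0] + P[1][1], min(lams), max(lams))
```

Because the trace of $P$ is held fixed, the gain never dies out, which is what gives the identifier its finite data memory.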



Lemma 8.6-1. In the CT-NRLS algorithm (1) we have

$$\frac{1}{1 + \mathrm{Tr}[P(0)]} \le \lambda(t) \le 1 \tag{8.6-2}$$

Proof For the sake of brevity, we omit the argument $t-1$ throughout the proof. First, note that, since $P = P' \ge 0$, $P$ can be written as $L'L$. Further, $\mathrm{sp}(L'L) = \mathrm{sp}(LL')$ and by normalization $\|x\| \le 1$, $\mathrm{sp}(M)$ denoting the set of the eigenvalues of the square matrix $M$. Hence,

$$\frac{x'P^2x}{\mathrm{Tr}[P(0)](1 + x'Px)} \le \frac{\lambda_{\max}(P)\,x'Px}{\mathrm{Tr}[P(0)](1 + x'Px)} \le \frac{x'Px}{1 + x'Px} \le \frac{\mathrm{Tr}[P(0)]}{1 + \mathrm{Tr}[P(0)]}$$

Then,

$$\lambda(t) = 1 - \frac{x'P^2x}{\mathrm{Tr}[P(0)](1 + x'Px)} \ge 1 - \frac{\mathrm{Tr}[P(0)]}{1 + \mathrm{Tr}[P(0)]} = \frac{1}{1 + \mathrm{Tr}[P(0)]}$$

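Problem 8.6-1 Consider the CT-NRLS algorithm (1) with $y(t)$, $t \in \mathbb{Z}_1$, satisfying (4-2). Let $\tilde\theta(t) := \theta(t) - \theta$ and $V(t) := \tilde\theta'(t)P^{-1}(t)\tilde\theta(t)$. Show that

$$V(t) = \lambda(t)\Big[ V(t-1) - \frac{\varepsilon^2(t)}{1 + x'(t-1)P(t-1)x(t-1)} \Big] \tag{8.6-3}$$

Conclude that:

$$\|\tilde\theta(t)\|^2 \le \mathrm{Tr}[P(0)]\,V(0) \prod_{j=1}^{t} \lambda(j) \tag{8.6-4}$$

and $\{V(t)\}$ converges.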



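Problem 8.6-2 Under the same assumptions as in Problem 1, show that

$$P^{-1}(t)\tilde\theta(t) = \lambda(t)P^{-1}(t-1)\tilde\theta(t-1)$$

and hence

$$\tilde\theta(t) = \Big[\prod_{j=1}^{t} \lambda(j)\Big] P(t)P^{-1}(0)\tilde\theta(0) \tag{8.6-5}$$

Let

$$\delta(t) := \prod_{j=1}^{t} \lambda(j) \tag{8.6-6}$$

and

$$\delta(\infty) := \lim_{t\to\infty} \delta(t) \tag{8.6-7}$$

Then, from (4) it follows that

$$\delta(\infty) = 0 \;\Longrightarrow\; \theta(\infty) := \lim_{t\to\infty} \theta(t) = \theta \tag{8.6-8}$$

If, on the contrary, $\delta(\infty) > 0$, Lemma 2 below establishes the existence of $P(\infty) := \lim_{t\to\infty} P(t)$. Hence, from (5)

$$\delta(\infty) > 0 \;\Longrightarrow\; \theta(\infty) = \theta + \delta(\infty)P(\infty)P^{-1}(0)\tilde\theta(0) \tag{8.6-9}$$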

Lemma 8.6-2. Consider the CT-NRLS algorithm (1). Then, there exists

$$\lim_{t\to\infty} P(t) =: P(\infty) = P'(\infty) \ge 0$$

whenever $\delta(\infty) > 0$.

Proof Let $P_{ij}(t)$ be the $(i,j)$-entry of $P(t)$. Then, since $P(t) = P'(t)$,

$$P_{ij}(t) = e_i'P(t)e_j = \frac{1}{2}\big\{ (e_i + e_j)'P(t)(e_i + e_j) - e_i'P(t)e_i - e_j'P(t)e_j \big\}$$

where $\{e_i\}_{i=1}^{\hat n_\theta}$ denotes the natural basis of $\mathbb{R}^{\hat n_\theta}$. Consequently, $P_{ij}(t)$ converges if the quadratic form $z'P(t)z$ converges for any $z \in \mathbb{R}^{\hat n_\theta}$. Now

$$z'P(t)z = \frac{1}{\lambda(t)}\big[ z'P(t-1)z - r(t-1) \big] \qquad [(1e)]$$

where

$$r(t-1) := \frac{[z'P(t-1)x(t-1)]^2}{1 + x'(t-1)P(t-1)x(t-1)} \ge 0$$

By iterating the above expression,

$$z'P(t)z = \frac{z'P(0)z}{\prod_{i=1}^{t} \lambda(i)} - \sum_{j=1}^{t} \frac{r(j-1)}{\prod_{i=j}^{t} \lambda(i)} \ge 0$$

Since $\delta(\infty) > 0$, it remains to show that the last term converges. This can be proved as follows

$$\sum_{j=1}^{t} \frac{r(j-1)}{\prod_{i=j}^{t} \lambda(i)} = \frac{1}{\prod_{i=1}^{t} \lambda(i)} \sum_{j=1}^{t} \Big[\prod_{i=1}^{j-1} \lambda(i)\Big] r(j-1)$$

where an empty product is taken to be 1. This together with the equation above shows that

$$\sum_{j=1}^{t} \Big[\prod_{i=1}^{j-1} \lambda(i)\Big] r(j-1) \le z'P(0)z$$

Now the L.H.S. of the above inequality is nondecreasing with t and upperbounded by $z'P(0)z$. Hence, it converges.

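Problem 8.6-3 Consider the CT-NRLS algorithm. Use convergence of $\{V(t)\}$ established in
Problem 1 to show that

$$ \lim_{t\to\infty} \sum_{k=1}^{t} \frac{\lambda(k)\,\varepsilon^2(k)}{1 + x'(k-1)P(k-1)x(k-1)} < \infty $$

and

$$ \lim_{t\to\infty} \sum_{k=1}^{t} \varepsilon^2(k) < \infty $$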



Problem 8.6-4 For the CT-NRLS algorithm, establish that

$$ \lim_{t\to\infty} \sum_{k=1}^{t} \|\theta(k) - \theta(k-1)\|^2 < \infty $$

and more generally

$$ \lim_{t\to\infty} \sum_{k=i}^{t} \|\theta(k) - \theta(k-i)\|^2 < \infty $$

for every positive integer $i$.

The above results are summed up in the following theorem. 

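Theorem 8.6-1. (Deterministic properties of CT-NRLS) Consider the
CT-NRLS algorithm (1) with

$$ y(t) = \varphi'(t-1)\theta, \qquad \theta \in \Theta \subset \mathbb{R}^{n_\theta} \tag{8.6-10} $$

where $\Theta$ is the parameter variety (3-2). Then, for any $\theta \in \Theta$, $\tilde\theta(t) := \theta(t) - \theta$, and
$\delta(\infty)$ as in (8), we have:

i.
$$ \delta(\infty) > 0 \;\Longrightarrow\; \lim_{t\to\infty} P(t) = P(\infty) = P'(\infty) \ge 0 \tag{8.6-11} $$

ii. as $t \to \infty$, $\theta(t)$ converges to $\theta(\infty)$ where

$$ \theta(\infty) = \begin{cases} \theta, & \delta(\infty) = 0 \\ \theta + \delta(\infty)P(\infty)P^{-1}(0)\tilde\theta(0), & \delta(\infty) > 0 \end{cases} \tag{8.6-12} $$

from which for any $k \in \mathbb{Z}_1$

$$ \lim_{t\to\infty} \|\theta(t) - \theta(t-k)\| = 0 \tag{8.6-13} $$

iii.
$$ \|\tilde\theta(t)\|^2 \le \left[ \prod_{i=1}^{t} \lambda(i) \right] \mathrm{Tr}[P(0)]\,V(0) \tag{8.6-14} $$

iv.
$$ \lim_{t\to\infty} \sum_{k=1}^{t} \varepsilon^2(k) < \infty \tag{8.6-15} $$

and this implies

a. $$ \lim_{t\to\infty} \varepsilon(t) = 0 \tag{8.6-16} $$

b. $$ \lim_{t\to\infty} \sum_{k=1}^{t} \|\theta(k) - \theta(k-1)\|^2 < \infty \tag{8.6-17} $$

or more generally

c. $$ \lim_{t\to\infty} \sum_{k=i}^{t} \|\theta(k) - \theta(k-i)\|^2 < \infty \tag{8.6-18} $$

for every positive integer $i$.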

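Problem 8.6-5 Consider the CT-NRLS algorithm (1). Let

$$ \bar P^{-1}(t) := P^{-1}(0) + \sum_{k=0}^{t-1} x(k)x'(k) $$

Then show that $\lambda_{\min}[\bar P^{-1}(t)] \to \infty$ as $t \to \infty$ implies that $\delta(\infty) = 0$ and hence, provided
that (8) holds, $\theta(\infty) \in \Theta$. [Hint: First show that, for $t \in \mathbb{Z}_1$,
$\delta^{-1}(t)P^{-1}(t) = P^{-1}(0) + \sum_{k=1}^{t} \delta^{-1}(k-1)\,x(k-1)x'(k-1)$. ]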

It is to be pointed out that some of the properties of Theorem 1 would not hold
true for a constant-trace RLS algorithm with no data normalization. In fact, data
normalization is essential to guarantee that $\lambda(t) > 0$ in (2), and, consequently, the
result of Problem 5 and $\{\varepsilon(k)\} \in \ell_2$ in (15), and, hence, (16)-(18).

Implicit STCC with CT-NRLS 

Hereafter, we shall consider a variant of the implicit STCC of Sect. 8.5 whereby
the RLS estimates are replaced by estimates $\theta(t)$ supplied by a CT-NRLS identifier
modified so as to keep $\theta_{\hat n_a+1}(t) \neq 0$. Specifically, the parameter vector $\theta$ is
estimated via the following CT-NRLS algorithm, $t \in \mathbb{Z}_1$,
$$ P(t-\tau+1) = \frac{1}{\lambda(t)}\left[ P(t-\tau) - \frac{a(t)P(t-\tau)x(t-\tau)x'(t-\tau)P(t-\tau)}{1 + a(t)x'(t-\tau)P(t-\tau)x(t-\tau)} \right] \tag{8.6-19b} $$

$$ \lambda(t) = 1 - \frac{1}{\mathrm{Tr}[P(0)]}\,\frac{a(t)x'(t-\tau)P^2(t-\tau)x(t-\tau)}{1 + a(t)x'(t-\tau)P(t-\tau)x(t-\tau)} \tag{8.6-19c} $$

where $a(t)$ is as in (5-8e),

$$ \varepsilon(t) = \frac{y(t) - \varphi'(t-\tau)\theta(t-1)}{m(t-\tau)} \tag{8.6-19d} $$

$$ \phantom{\varepsilon(t)} = \gamma(t) - x'(t-\tau)\theta(t-1) \tag{8.6-19e} $$

$$ m(t-\tau) := \max\{ m, \|\varphi(t-\tau)\| \}, \qquad m > 0 \tag{8.6-19f} $$

As in (5-9), the control law is given by

$$ \varphi'(t)\theta(t) = r(t+\tau) \tag{8.6-20} $$

The algorithm is initialized from any $\theta(0)$ with $\theta_{\hat n_a+1}(0) \neq 0$, and any $P(1-\tau) =
P'(1-\tau) > 0$.

Problem 8.6-6 Consider the CT-NRLS algorithm (19) embodying the constraint that $\theta_{\hat n_a+1}(t) \neq 0$.
Let $V(t) := \tilde\theta'(t)P^{-1}(t-\tau+1)\tilde\theta(t)$, $\tilde\theta(t) := \theta(t) - \theta$. Show that

$$ V(t) = \lambda(t)\left[ V(t-1) - \frac{a(t)\varepsilon^2(t)}{1 + a(t)x'(t-\tau)P(t-\tau)x(t-\tau)} \right] $$

(Cf. (3)). Use this result to show that the algorithm still enjoys the same properties as in Theorem
1.

The conclusion of Problem 5 allows us to follow similar lines as the ones leading to 
Theorem 5-1 to prove the next global convergence result. 

Theorem 8.6-2. (Global convergence of the implicit STCC with CT-NRLS)
Provided that Assumption 5-1 is satisfied, the implicit STCC algorithm
based on the CT-NRLS (19) and the control law (20), when applied to the plant
(5-1), for any possible initial condition, yields:

i. $\{y(t)\}$ and $\{u(t)\}$ are bounded sequences; (8.6-21)

ii. $\lim_{t\to\infty} [y(t) - r(t)] = 0$; (8.6-22)

iii. $\lim_{t\to\infty} \sum_{k=\tau}^{t} [y(k) - r(k)]^2 < \infty$ (8.6-23)



Problem 8.6-7 Prove Theorem 2 [Hint: Apply first the Key Technical Lemma 3-1 to prove 
(21) and (22). Next, exploit (21) to prove (23). ] 



Main points of the section In order to achieve good performance with time 
varying plants, it is advisable that adaptive controllers be equipped with identifiers 
with finite data memory, e.g. constant-trace RLS. The implicit STCC controller 
(19), (20), based on a constant trace RLS identifier with data normalization, when 
applied to the plant (3-1) enjoys the same global convergence properties as the 
implicit STCC system (8), (9), based on the standard RLS identifier. 
8.7 Self-Tuning Minimum Variance Control

8.7.1 Implicit Linear Regression Models and ST Regulation 

We now turn to consider self-tuning control of a SISO CARMA plant

$$ A(d)y(t) = B(d)u(t) + C(d)e(t) \tag{8.7-1} $$

satisfying all the assumptions of Theorem 7.3-2. In particular we focus our attention
on the Minimum-Variance (MV) regulator given by (7.3-17) for $\rho = 0$

$$ Q_\tau(d)B_0(d)u(t) = -G_\tau(d)y(t) \tag{8.7-2} $$

where

$$ C(d) = A(d)Q_\tau(d) + d^\tau G_\tau(d), \qquad \partial Q_\tau(d) \le \tau - 1 $$

and, as usual, $\tau$ denotes the plant I/O delay. We recall the prediction form (7.3-9)
of (1)

$$ C(d)y(t+\tau) = Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) + C(d)Q_\tau(d)e(t+\tau) \tag{8.7-3} $$

Now, take into account that by the regulation law (2) in closed-loop

$$ y(t) = Q_\tau(d)e(t) \tag{8.7-4} $$

to write

$$ \begin{aligned} y(t+\tau) &= Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) + [1 - C(d)]\,y(t+\tau) + C(d)Q_\tau(d)e(t+\tau) \\ &= Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) + \{[1 - C(d)]\,Q_\tau(d) + C(d)Q_\tau(d)\}\,e(t+\tau) \\ &= Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) + Q_\tau(d)e(t+\tau) \end{aligned} \tag{8.7-5} $$

Therefore, we conclude that the MV-regulated system admits the output prediction
model (5). Note that the term $\nu(t+\tau) := Q_\tau(d)e(t+\tau)$ is a linear combination
of $e(t+\tau), \cdots, e(t+1)$. Then, $\mathcal{E}\{\varphi(t)\nu(t+\tau)\} = 0$ if $\varphi(t)$ is any vector with
components from $y^t$ and $u^t$. Hence, by the same argument as in (6.4-13), we can
conclude that (5) is a linear regression model in that the coefficients of the
polynomials $Q_\tau(d)B_0(d)$ and $G_\tau(d)$ can be consistently estimated via linear regression
algorithms, such as the RLS algorithm. Note also that the model (5) includes the
same polynomials that are relevant for the MV regulation law (2). Thus, if the plant
(1) is under MV regulation, it is reasonable to attempt to estimate the parameter
vector $\theta$ of the coefficients of the polynomials $Q_\tau(d)B_0(d)$ and $G_\tau(d)$ in the linear
regression model (5) via a recursive linear regression identifier, e.g. standard RLS.
This is a significant simplification over the explicit or indirect approach consisting
of identifying the CARMA model (1) via pseudolinear regression algorithms.

This route is the transposition to the present stochastic setting of the one fol- 
lowed in the deterministic case which led us to consider the implicit STCCs of the 
last two sections. We insist on pointing out that (5) is not a representation of the 
plant. In fact, it only yields a correct description of the evolution of the closed- 
loop system in stochastic steady state provided that MV regulation is used and 
the regulated system is internally stable. Such a closed-loop representation will 
be referred to as an implicit model and the related adaptive MV regulator briefly
alluded to after (5) as an implicit MV Self-Tuning (MV ST) regulator. We see 
that MV regulation acts in such a way that the original CARMA plant (1) behaves 
in closed-loop as the implicit linear regression model (5). MV regulation is not the 
only regulation law under which a CARMA plant admits an implicit linear regres- 
sion model. In the next chapter we shall see that this holds true for LQS regulation 
as well. This implicit modelling property is important in that it can be used to 
construct implicit self tuning LQS regulators. 

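It is not obvious that the implicit MV ST regulator based on the above
optimistic reasoning will actually work. It is instructive to address this issue by
analysing in detail a MV ST regulator of implicit type. For the sake of simplicity,
we assume throughout that the plant I/O delay $\tau$ equals 1. In such a case we have:

$$ \begin{aligned} A(d) &= 1 + a_1 d + \cdots + a_{n_a} d^{n_a} \\ B(d) &= b_1 d + \cdots + b_{n_b} d^{n_b} \qquad (b_1 \neq 0) \\ C(d) &= 1 + c_1 d + \cdots + c_{n_c} d^{n_c} \end{aligned} \tag{8.7-6} $$

$$ Q_1(d) = 1 \quad \text{and} \quad d\,G_1(d) = C(d) - A(d) \tag{8.7-7} $$

the MV regulation law

$$ u(t) = -\frac{1}{b_1}\left[ \sum_{i=1}^{\hat n} (c_i - a_i)\,y(t-i+1) + \sum_{i=2}^{\hat n} b_i\,u(t-i+1) \right] \tag{8.7-8} $$

and the implicit CAR model

$$ \begin{aligned} y(t) &= (c_1 - a_1)y(t-1) + \cdots + (c_{\hat n} - a_{\hat n})y(t-\hat n) + b_1 u(t-1) + \cdots + b_{\hat n}u(t-\hat n) + e(t) \\ &= \varphi'(t-1)\theta + e(t) \end{aligned} \tag{8.7-9a} $$

where

$$ \hat n \ge \max\{n_a, n_b, n_c\} \tag{8.7-9b} $$

$$ \varphi(t-1) := \left[\, y(t-1) \;\cdots\; y(t-\hat n) \;\; u(t-1) \;\cdots\; u(t-\hat n) \,\right]' \tag{8.7-9c} $$

$$ \theta := \left[\, c_1 - a_1 \;\cdots\; c_{\hat n} - a_{\hat n} \;\; b_1 \;\cdots\; b_{\hat n} \,\right]' \tag{8.7-9d} $$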

8.7.2 Implicit RLS+MV ST Regulation 

To show that the above approach may lead to the desired result, we consider an 
example. 



Example 8.7-1 (An implicit RLS+MV ST regulator) Consider the CARMA plant

$$ y(t+1) = -a\,y(t) + b\,u(t) + c\,e(t) + e(t+1) \tag{8.7-10} $$

where $e$ is a zero-mean white noise with $\mathcal{E}\{e^2(t)\} = \sigma^2$ and $|c| < 1$. We assume that RLS with
regressor $\varphi(t) := [\, y(t) \;\; u(t) \,]'$ is used to estimate $\theta := [\, c - a \;\; b \,]'$ in the implicit CAR model

$$ y(t) = [\, y(t-1) \;\; u(t-1) \,]\,\theta + e(t) \tag{8.7-11} $$

which results from (10) under MV regulation

$$ u(t) = -\frac{c - a}{b}\,y(t) \tag{8.7-12} $$

Let $\theta(t) = [\, \hat a(t) \;\; \hat b(t) \,]'$ be the RLS estimate of $\theta$ based on $(y^t, u^{t-1})$. Then, the next input
$u(t)$ is chosen, according to Enforced Certainty Equivalence, so as to satisfy $\varphi'(t)\theta(t) = 0$ or,
explicitly,

$$ u(t) = -\frac{\hat a(t)}{\hat b(t)}\,y(t) \tag{8.7-13} $$

Such an adaptive regulator will be referred to as the RLS+MV ST regulator.

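Assume that all variables stay bounded and $\theta(t)$ converges to $\theta(\infty) := [\, \hat a \;\; \hat b \,]'$. Then, by
(6.3-14), $\theta(\infty)$ satisfies the normal equations

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} \begin{bmatrix} y^2(k) & y(k)u(k) \\ y(k)u(k) & u^2(k) \end{bmatrix} \begin{bmatrix} \hat a \\ \hat b \end{bmatrix} = \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} \begin{bmatrix} y(k)y(k+1) \\ u(k)y(k+1) \end{bmatrix} \tag{8.7-14} $$

Now the L.H.S. of this equation under the asymptotic regulation law

$$ u(t) = -\frac{\hat a}{\hat b}\,y(t) \tag{8.7-15} $$

is found to be zero. Hence, under (15), the R.H.S. of (14) is zero and this reduces to the condition

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} y(k)y(k+1) = 0 \tag{8.7-16} $$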

To find the implications of (16), we use (15) into (10) to get

$$ y(k+1) = -\alpha\,y(k) + c\,e(k) + e(k+1), \qquad \alpha := a + \frac{b}{\hat b}\,\hat a \tag{8.7-17} $$

Multiplying each term by $y(k)$ and taking the time average and next the limit, we get

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} y(k)y(k+1) = \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} \left[ -\alpha\,y^2(k) + c\,e(k)y(k) + e(k+1)y(k) \right] = -\alpha\sigma_y^2 + c\,\sigma^2 \tag{8.7-18} $$

Here we have used the fact that, because of whiteness of $e$,

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} e(k+1)y(k) = 0 \tag{8.7-19} $$

and by (17)

$$ e(k)y(k) = -\alpha\,e(k)y(k-1) + c\,e(k)e(k-1) + e^2(k) $$

and hence

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} e(k)y(k) = \sigma^2 . $$

From (16) and (18) it follows that

$$ -\alpha\sigma_y^2 + c\,\sigma^2 = 0 \tag{8.7-20} $$

To evaluate $\sigma_y^2$ we use (17) to get

$$ \sigma_y^2 = \alpha^2\sigma_y^2 - 2\alpha c\,\sigma^2 + (1 + c^2)\sigma^2 $$

$$ \sigma_y^2 = \frac{1 - 2\alpha c + c^2}{1 - \alpha^2}\,\sigma^2 \tag{8.7-21} $$
Substituting (21) in (20) we get the following equation in $\alpha$

$$ -\alpha\,\frac{1 - 2\alpha c + c^2}{1 - \alpha^2} + c = 0 $$

which, after clearing the denominator, is the quadratic equation $c\alpha^2 - (1 + c^2)\alpha + c = 0$,
whose roots are $\alpha = c$ and $\alpha = 1/c$. The only solution corresponding to a stable closed-loop
system is

$$ \alpha = c \tag{8.7-22} $$

Then, it follows from (17)

$$ \frac{\hat a}{\hat b} = \frac{c - a}{b} \tag{8.7-23} $$

which, according to (12) and (15), is the MV regulation gain. Thus the conclusion is that if the
RLS+MV adaptive regulation law (13) converges to a stabilizing regulation law, then self-tuning
occurs. Note, however, that $\theta(t) = [\, \hat a(t) \;\; \hat b(t) \,]'$ need not converge to $[\, c - a \;\; b \,]'$. In fact it
is enough if the ratio $\hat a(t)/\hat b(t)$ converges to $(c - a)/b$. This is ensured if $\theta(t)$ converges to a random multiple
of $\theta$, viz. $\lim_{t\to\infty} \theta(t) = \nu\theta$, where $\nu$ is a scalar random variable.
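
The two candidate limit points of Example 1 can be checked numerically. The sketch below is illustrative only; the function names `limit_points` and `output_variance_ratio` are hypothetical, and the quadratic is the one obtained by clearing the denominator after substituting (21) into (20). It assumes $c \neq 0$.

```python
import numpy as np


def limit_points(c):
    """Roots of c*alpha^2 - (1 + c^2)*alpha + c = 0, sorted in increasing order."""
    return np.sort(np.roots([c, -(1.0 + c * c), c]).real)


def output_variance_ratio(alpha, c):
    """sigma_y^2 / sigma^2 as given by (8.7-21); meaningful only for |alpha| < 1."""
    return (1.0 - 2.0 * alpha * c + c * c) / (1.0 - alpha * alpha)
```

For instance, `limit_points(0.5)` gives the pair $c = 0.5$ and $1/c = 2$, and at the stable root `output_variance_ratio(0.5, 0.5)` evaluates to 1, i.e. the regulated output variance equals the innovations variance $\sigma^2$, as expected under MV regulation with unit delay.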



8.7.3 Implicit SG+MV ST Regulation 

The results of Example 1 are encouraging in that they indicate that in the RLS+MV 
ST regulator self-tuning occurs whenever the adaptive regulator converges to a 
stabilizing compensator. This is the celebrated w.s. self-tuning property of the 
adaptive RLS+MV regulator which, for the first time, was pointed out in the seminal 
paper [AW73]. However, no assurance of convergence is given. We next turn to
discuss global convergence of an adaptive MV regulator based on a Stochastic 
Gradient (SG) identifier. The reason for this choice is that global convergence 
analysis for the RLS+MV ST regulator is a difficult task. For some results on this 
subject see [Joh92]. 

Consider then (1)-(9). Let the parameter vector

$$ \theta = \left[\, \alpha_1 \;\cdots\; \alpha_{\hat n} \;\; b_1 \;\cdots\; b_{\hat n} \,\right]' $$

be estimated via the SG algorithm

$$ \theta(t) = \theta(t-1) + \frac{a\,\varphi(t-1)}{q(t)}\,\varepsilon(t), \qquad a > 0 \tag{8.7-24a} $$

$$ \varepsilon(t) = y(t) - \varphi'(t-1)\theta(t-1) \tag{8.7-24b} $$

$$ q(t) = q(t-1) + \|\varphi(t-1)\|^2 \tag{8.7-24c} $$

with $t \in \mathbb{Z}_1$, $\theta(0) \in \mathbb{R}^{n_\theta}$, $n_\theta = 2\hat n$, and $q(0) > 0$. Then, $u(t)$ is chosen according
to the Enforced Certainty Equivalence as follows

$$ \varphi'(t)\theta(t) = 0 \tag{8.7-25a} $$

or

$$ u(t) = -\frac{1}{b_1(t)} \left[\, \alpha_1(t) \;\cdots\; \alpha_{\hat n}(t) \;\; b_2(t) \;\cdots\; b_{\hat n}(t) \,\right] s(t), \qquad s(t) := \left[\, y(t) \;\cdots\; y(t-\hat n+1) \;\; u(t-1) \;\cdots\; u(t-\hat n+1) \,\right]' \in \mathbb{R}^{n_\theta - 1} \tag{8.7-25b} $$

The algorithm (24), (25), will be referred to as the SG+MV ST regulator. To
analyse the algorithm we make the following stochastic assumptions. The process
$\{\varphi(0), e(1), e(2), \cdots\}$ is defined on an underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and
we define $\mathcal{F}_0$ to be the $\sigma$-field generated by $\varphi(0)$. Further, for all $t \in \mathbb{Z}_1$, $\mathcal{F}_t$ denotes
the $\sigma$-field generated by $\{\varphi(0), e(1), \cdots, e(t)\}$. The following martingale difference
assumptions are adopted on the process $e$

$$ \mathcal{E}\{e(t) \mid \mathcal{F}_{t-1}\} = 0 \quad \text{a.s.} \tag{8.7-26a} $$

$$ \mathcal{E}\{e^2(t) \mid \mathcal{F}_{t-1}\} = \sigma^2 \quad \text{a.s.} \tag{8.7-26b} $$

$$ \mathcal{E}\{e^4(t) \mid \mathcal{F}_{t-1}\} \le M < \infty \quad \text{a.s.} \tag{8.7-26c} $$

$$ e(t) \text{ has a strictly positive probability density} \tag{8.7-26d} $$

The last condition implies, [MC85] [KV86], that the event $\{b_1(t) = 0\}$ has zero
probability, and hence the control law (25b) is well defined a.s. In the global
convergence proof of the SG+MV ST regulator of the next Theorem 1, which is based
on the stochastic Lyapunov function method (Cf. Sect. 6.4), we shall avail ourselves of the
following lemma.



Lemma 8.7-1. [Cai88] (Passivity and Positive Reality) Consider a
time-invariant linear system with $p \times p$ transfer matrix $H(d)$. Let $H(d)$ be PR
(Cf. (6.4-34)). Then, the system is passive, i.e. for all input sequences $\{u(k)\}_{k=0}^{\infty}$
and corresponding outputs $\{y(k)\}_{k=0}^{\infty}$

$$ \sum_{k=0}^{N-1} u'(k)y(k) \ge K $$

for some constant $K$.

Theorem 8.7-1. (Global convergence of the SG+MV ST regulator) Consider
the CARMA plant (1), (6), where the innovations $e$ satisfy (26), regulated by
the SG+MV algorithm (24), (25), with $\hat n$ as in (9b). Suppose that

the plant is minimum-phase (8.7-27)

and

$$ C(d) - \frac{a}{2} \ \text{is SPR} \tag{8.7-28} $$

Then,

$$ \|\theta(t) - \theta\| \ \text{converges a.s.} \tag{8.7-29} $$

the input sample paths satisfy

$$ \limsup_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} u^2(k) < \infty \quad \text{a.s.} \tag{8.7-30} $$

and the adaptive system is self-optimizing, i.e.

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{t} y^2(k) = \sigma^2 \quad \text{a.s.} \tag{8.7-31} $$

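Proof Let $V(k) := \|\tilde\theta(k)\|^2$, $\tilde\theta(k) := \theta(k) - \theta$, with $\theta$ as in (9d). Considering (25a), we find
$\tilde\theta(k+1) = \tilde\theta(k) + a\,q^{-1}(k+1)\varphi(k)y(k+1)$. Hence,

$$ V(k+1) = V(k) + \frac{2a}{q(k+1)}\,\varphi'(k)\tilde\theta(k)\,y(k+1) + \frac{a^2}{q^2(k+1)}\,\|\varphi(k)\|^2 y^2(k+1) $$

Taking conditional expectations

$$ \mathcal{E}\{V(k+1) \mid \mathcal{F}_k\} = V(k) + \frac{2a}{q(k+1)}\,\varphi'(k)\tilde\theta(k)\,\mathcal{E}\{y(k+1) \mid \mathcal{F}_k\} + \frac{a^2}{q^2(k+1)}\,\|\varphi(k)\|^2\,\mathcal{E}\{y^2(k+1) \mid \mathcal{F}_k\} $$

Further, from (1)

$$ y(k+1) = \mathcal{E}\{y(k+1) \mid \mathcal{F}_k\} + e(k+1) \tag{8.7-32} $$

and, in turn,

$$ \mathcal{E}\{y^2(k+1) \mid \mathcal{F}_k\} = \mathcal{E}^2\{y(k+1) \mid \mathcal{F}_k\} + \sigma^2 \tag{8.7-33} $$

Moreover,

$$ \begin{aligned} C(d)\,\mathcal{E}\{y(k+1) \mid \mathcal{F}_k\} &= C(d)\,[y(k+1) - e(k+1)] && [(32)] \\ &= [C(d) - A(d)]\,y(k+1) + B(d)u(k+1) && [(1)] \\ &= \varphi'(k)\theta = -\varphi'(k)\tilde\theta(k) && [(25a)] \end{aligned} $$

Therefore, setting

$$ z(k) := \mathcal{E}\{y(k+1) \mid \mathcal{F}_k\} \tag{8.7-34} $$

we get

$$ \begin{aligned} \mathcal{E}\{V(k+1) \mid \mathcal{F}_k\} &= V(k) - \frac{2a}{q(k+1)} \left\{ \left[ C(d) - \frac{a\,\|\varphi(k)\|^2}{2q(k+1)} \right] z(k) \right\} z(k) + \frac{a^2\sigma^2}{q^2(k+1)}\,\|\varphi(k)\|^2 \\ &\le V(k) - \frac{2a}{q(k+1)} \left\{ \left[ C(d) - \frac{a}{2} \right] z(k) \right\} z(k) + \frac{a^2\sigma^2}{q^2(k+1)}\,\|\varphi(k)\|^2 \\ &= V(k) - \frac{2a}{q(k+1)} \left\{ \left[ C(d) - \frac{a + \gamma}{2} \right] z(k) \right\} z(k) - \frac{a\gamma}{q(k+1)}\,z^2(k) + \frac{a^2\sigma^2}{q^2(k+1)}\,\|\varphi(k)\|^2 \end{aligned} $$

where the inequality uses $\|\varphi(k)\|^2 / q(k+1) \le 1$, and $\gamma > 0$ is such that $C(d) - (a + \gamma)/2$ is PR. Let, for an appropriate $K$,

$$ S_k := 2a \sum_{i=0}^{k} \left\{ \left[ C(d) - \frac{a + \gamma}{2} \right] z(i) \right\} z(i) + K \ge 0 $$

Such a $K$ exists since $C(d) - (a + \gamma)/2$ is PR and by virtue of Lemma 1. Using $S_k$ we can write

$$ \mathcal{E}\{V(k+1) \mid \mathcal{F}_k\} \le V(k) - \frac{S_k - S_{k-1}}{q(k+1)} - \frac{a\gamma\,z^2(k)}{q(k+1)} + \frac{a^2\sigma^2}{q^2(k+1)}\,\|\varphi(k)\|^2 $$

or, setting

$$ M(k) := V(k) + \frac{S_{k-1}}{q(k)} \ge 0 \tag{8.7-35} $$

since $q(k) \le q(k+1)$ we obtain the following near-supermartingale inequality

$$ \mathcal{E}\{M(k+1) \mid \mathcal{F}_k\} \le M(k) - \frac{a\gamma}{q(k+1)}\,z^2(k) + \frac{a^2\sigma^2}{q^2(k+1)}\,\|\varphi(k)\|^2 \tag{8.7-36} $$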
To exploit the Martingale Convergence Theorem D.6-1, we must check that

$$ \sum_{k=0}^{\infty} \frac{\|\varphi(k)\|^2}{q^2(k+1)} < \infty $$

This follows since

$$ \sum_{k=0}^{N} \frac{\|\varphi(k)\|^2}{q^2(k+1)} = \sum_{k=0}^{N} \frac{q(k+1) - q(k)}{q^2(k+1)} \le \sum_{k=0}^{N} \frac{q(k+1) - q(k)}{q(k+1)q(k)} = \sum_{k=0}^{N} \left[ \frac{1}{q(k)} - \frac{1}{q(k+1)} \right] = \frac{1}{q(0)} - \frac{1}{q(N+1)} \le \frac{1}{q(0)} < \infty $$

Applying the Martingale Convergence Theorem D.1 to (36), we obtain

$$ M(k) \to M(\infty) \quad \text{a.s.} \tag{8.7-37} $$

and, also

$$ \sum_{k=0}^{\infty} \frac{z^2(k)}{q(k+1)} < \infty \quad \text{a.s.} \tag{8.7-38} $$

By (37), then (29) follows. Since $q(k+1) > 0$, if

$$ \lim_{k\to\infty} q(k) = \infty \quad \text{a.s.} \tag{8.7-39} $$

by Kronecker's lemma (Result D-3) we conclude that

$$ \lim_{N\to\infty} \frac{1}{q(N+1)} \sum_{k=0}^{N} z^2(k) = 0 \quad \text{a.s.} \tag{8.7-40} $$

We show that (39) holds by contradiction. Suppose that $\lim_{k\to\infty} q(k) < \infty$. This implies that
$\lim_{k\to\infty} y^2(k) = \lim_{k\to\infty} u^2(k) = 0$. From this, because of (1), it follows $\lim_{k\to\infty} e(k) = 0$. But
this happens only on a set in $\Omega$ of zero probability measure since by (26)

$$ \lim_{N\to\infty} \frac{1}{N} \sum_{k=1}^{N} e^2(k) = \sigma^2 \quad \text{a.s.} \tag{8.7-41} $$

Hence (39) follows.

We next want to show that

$$ \left\{ \frac{q(t)}{t} \right\}_{t=1}^{\infty} \ \text{is bounded a.s.} \tag{8.7-42} $$

from which (30) follows at once. Since the plant is minimum-phase, we can argue as after (5-14)
to show that

$$ \frac{1}{t} \sum_{k=0}^{t-1} u^2(k) \le \frac{1}{t} \sum_{k=0}^{t-1} \left[ c_1 y^2(k+1) + c_2 e^2(k+1) \right] \le \frac{c_1}{t} \sum_{k=0}^{t-1} y^2(k+1) + c_3 \qquad [(41)] $$

Therefore

$$ \frac{q(t+1)}{t+1} \le \frac{c_4}{t+1} \sum_{k=0}^{t} y^2(k+1) + c_5 \le \frac{c_6}{t+1} \sum_{k=0}^{t} z^2(k) + c_7 $$

where the last inequality with $c_6 > 0$ follows from (32), (34) and (41). Then we have

$$ \frac{1}{q(t+1)} \sum_{k=0}^{t} z^2(k) \ge \frac{q(t+1) - (t+1)c_7}{c_6\,q(t+1)} $$

Suppose now that (42) is not true. Then along some subsequence $\{t_i\}$, we get

$$ \lim_{i\to\infty} \frac{1}{q(t_i+1)} \sum_{j=0}^{t_i} z^2(j) \ge \frac{1}{c_6} > 0 $$

which contradicts (40). Then (42) holds.
We finally prove (31).

$$ \sum_{k=1}^{t} y^2(k) = \sum_{k=1}^{t} \left[ z^2(k-1) + 2z(k-1)e(k) + e^2(k) \right] \qquad [(32)\ \&\ (34)] $$

By Schwarz inequality

$$ \left| \sum_{k=1}^{t} z(k-1)e(k) \right| \le \left[ \sum_{k=1}^{t} z^2(k-1) \right]^{1/2} \left[ \sum_{k=1}^{t} e^2(k) \right]^{1/2} $$

Since by (40) and (42)

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{t} z^2(k-1) = 0 \quad \text{a.s.} \tag{8.7-43} $$

(31) follows by virtue of (41).
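
The summability bound used in the proof above to invoke the Martingale Convergence Theorem can be checked numerically for an arbitrary regressor sequence. This is a small sketch, with the function name `normalized_regressor_sum` chosen here for illustration; it only assumes the recursion (24c) with $q(0) > 0$.

```python
import numpy as np


def normalized_regressor_sum(phis, q0):
    """Return sum_k ||phi(k)||^2 / q(k+1)^2 and the final value of q,
    where q(k+1) = q(k) + ||phi(k)||^2 as in (8.7-24c)."""
    q = float(q0)
    total = 0.0
    for phi in phis:
        q_next = q + float(phi @ phi)
        total += (q_next - q) / q_next**2   # ||phi(k)||^2 / q^2(k+1)
        q = q_next
    return total, q
```

Whatever the regressors, the partial sums should never exceed $1/q(0) - 1/q(N+1)$, which is the telescoping bound derived in the proof.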

Theorem 8.7-2. Under the same assumptions as in Theorem 1 with the only
exception of (28) replaced by

$$ C(d) \ \text{is SPR} \tag{8.7-44} $$

the SG+MV ST regulator for every $a > 0$ in (24) has all the properties of Theorem
1 except possibly for (29).

Problem 8.7-1 [KV86] Prove Theorem 2. [Hint: Use Theorem 1, the fact that the control
law (25) is invariant if $\theta(t)$ is changed into $\alpha\theta(t)$, $\alpha \in \mathbb{R}$, and that if $\theta(0)$ and $a$ are changed into
$\alpha\theta(0)$ and $\alpha a$, respectively, $\theta(t)$ is changed into $\alpha\theta(t)$ in the SG+MV ST regulated system as can
be proved by induction on $t$. ]

Conditions under which self-tuning occurs are given below. 

Fact 8.7-1. [KV86] Assume the conditions of Theorem 2 and that the plant has
no reduced order MV regulator than (8). Then, for the SG+MV ST regulator we
have

$$ \lim_{t\to\infty} \theta(t) = \nu\theta \tag{8.7-45} $$

for some random variable $\nu$.

Note that the order of the MV regulator (8) can be always reduced under (9b)
unless $\hat n = \max(n_a, n_b, n_c)$.
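
The SG+MV ST regulator (24), (25) analysed above can be sketched for the first-order plant (10) with $\hat n = 1$, so that $\varphi(t) = [\, y(t) \;\; u(t) \,]'$. This is an illustrative simulation under assumed numerical values (plant coefficients, gain $a = 1$, Gaussian innovations, initial estimate); the function name `simulate_sg_mv` is hypothetical and nothing here is taken from the book beyond the recursions themselves.

```python
import numpy as np


def simulate_sg_mv(a_p=-0.5, b_p=1.0, c_p=0.3, gain=1.0, steps=2000, seed=0):
    """Simulate y(t+1) = -a_p*y(t) + b_p*u(t) + c_p*e(t) + e(t+1)
    under the SG+MV self-tuning regulator (8.7-24), (8.7-25)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(steps + 1)   # unit-variance innovations (assumed Gaussian)
    theta = np.array([0.0, 1.0])         # initial estimate with nonzero b_1(0)
    q = 1.0                              # q(0) > 0
    y = np.zeros(steps + 1)
    u = np.zeros(steps)
    for t in range(steps):
        # Enforced Certainty Equivalence (25a): phi'(t) theta(t) = 0
        u[t] = -theta[0] * y[t] / theta[1]
        phi = np.array([y[t], u[t]])
        y[t + 1] = -a_p * y[t] + b_p * u[t] + c_p * e[t] + e[t + 1]
        q += float(phi @ phi)                   # (24c)
        eps = y[t + 1] - float(phi @ theta)     # (24b); zero regression term under (25a)
        theta = theta + gain * phi * eps / q    # (24a)
    return y, u, theta
```

With these values $C(d) - a/2 = 0.5 + 0.3d$ has positive real part on the unit circle, so condition (28) is met. One would then expect the running average of $y^2(k)$ to approach $\sigma^2 = 1$ and the ratio of the two estimates to approach $(c - a)/b = 0.8$, although the stochastic gradient recursion can be slow.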



Minimum-Variance Self-Tuning Trackers

We consider again the SISO CARMA plant (1), (6), (26), our interest now being on
finding the control law which minimizes in stochastic steady-state the performance
index

$$ C = \mathcal{E}\left\{ [y(t+\tau) - r(t+\tau)]^2 \mid y^t, r^{t+\tau} \right\} \tag{8.7-46} $$

where $r$ is a preassigned output reference. Along the lines of Sect. 7.3 we find for the
optimal control law

$$ Q_\tau(d)B_0(d)u(t) = -G_\tau(d)y(t) + C(d)r(t+\tau) \tag{8.7-47} $$

which reduces to (2) in the pure regulation problem $r(t) \equiv 0$. The control law (47),
which will be referred to as Minimum-Variance (MV) control, yields (Cf. Problem
7.3-1) a stable closed-loop system if and only if $B_0(d)$ is strictly Hurwitz, i.e. the
plant is minimum-phase. This is therefore an assumption that we shall adopt
throughout the section.

As with MV regulation, the next step is to derive an implicit model for the
controlled system. To this end, recall the prediction form (3) of (1). Take into
account that by (47) in closed-loop

$$ y(t) = r(t) + Q_\tau(d)e(t) \tag{8.7-48} $$

to find

$$ y(t+\tau) = Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) + [1 - C(d)]\,r(t+\tau) + Q_\tau(d)e(t+\tau) \tag{8.7-49} $$

or denoting the tracking error by

$$ \varepsilon_y(t) := y(t) - r(t) \tag{8.7-50} $$

$$ \varepsilon_y(t+\tau) = Q_\tau(d)B_0(d)u(t) + G_\tau(d)y(t) - C(d)r(t+\tau) + Q_\tau(d)e(t+\tau) \tag{8.7-51} $$


Therefore, we conclude that the MV-controlled system evolves in accordance with
the linear regression model (51), where the reference $r$ appears as an exogenous
input. The coefficients of the polynomials of this model can be estimated by a
linear regression algorithm, e.g. SG or RLS. In fact (51) can be rewritten as

$$ \varepsilon_y(t) = \varphi'(t-\tau)\theta + Q_\tau(d)e(t) \tag{8.7-52a} $$

$$ \varphi(t) := \left[\, y(t) \;\cdots\; y(t-n_y) \;\; u(t) \;\cdots\; u(t-n_u) \;\; -r(t+\tau) \;\cdots\; -r(t+\tau-n_c) \,\right]' \tag{8.7-52b} $$

$$ \theta := \left[\, \alpha_0 \;\cdots\; \alpha_{n_y} \;\; \beta_0 \;\cdots\; \beta_{n_u} \;\; 1 \;\; c_1 \;\cdots\; c_{n_c} \,\right]' \tag{8.7-52c} $$

while (47) is equivalent to

$$ \varphi'(t)\theta = 0 \tag{8.7-53} $$

In (52) $n_y$, $n_u$ and $n_c$ denote the degrees of the polynomials $G_\tau(d)$,
$Q_\tau(d)B_0(d)$ and $C(d)$, respectively, while $\alpha_i$ and $\beta_i$ denote the coefficients of $G_\tau(d)$ and,
respectively, $Q_\tau(d)B_0(d)$. The discussion above, along with the convergence results
on SG+MV ST regulation, motivates the following ST controller.



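$$ \theta(t) = \theta(t-1) + \frac{a\,\varphi(t-\tau)}{q(t-\tau+1)}\,\varepsilon(t), \qquad a > 0 \tag{8.7-54a} $$

$$ \varepsilon(t) = \varepsilon_y(t) - \varphi'(t-\tau)\theta(t-1) \tag{8.7-54b} $$

$$ q(t) = q(t-1) + \|\varphi(t-1)\|^2 \tag{8.7-54c} $$

with $t \in \mathbb{Z}_1$, $\theta(0) \in \mathbb{R}^{n_\theta}$, and $q(1-\tau) > 0$. Further, $u(t)$ is chosen according to
the Enforced Certainty Equivalence as follows

$$ \varphi'(t)\theta(t) = 0 \tag{8.7-55a} $$

or

$$ u(t) = -\frac{1}{\beta_0(t)} \left[\, \alpha_0(t) \;\cdots\; \alpha_{n_y}(t) \;\; \beta_1(t) \;\cdots\; \beta_{n_u}(t) \;\; c_0(t) \;\; c_1(t) \;\cdots\; c_{n_c}(t) \,\right] s(t) \tag{8.7-55b} $$

where

$$ s(t) := \left[\, y(t) \;\cdots\; y(t-n_y) \;\; u(t-1) \;\cdots\; u(t-n_u) \;\; -r(t+\tau) \;\cdots\; -r(t+\tau-n_c) \,\right]' \tag{8.7-55c} $$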


The above algorithm will be referred to as the SG+MV ST controller. Using the
stochastic Lyapunov function method as in Theorem 1, it can be shown that this
adaptive controller applied to the CARMA plant is self-optimizing.

Theorem 8.7-3. (Global convergence of the SG+MV ST controller) Provided
that $\{r(t)\}$ is a bounded sequence, under the same conditions of Theorem
2 with $a > 0$, we have for the SG+MV ST algorithm applied to the CARMA plant:

• $\|\theta(t) - \theta\|$ converges a.s.; (8.7-56)

• The input paths satisfy

$$ \limsup_{t\to\infty} \frac{1}{t} \sum_{k=0}^{t-1} u^2(k) < \infty \quad \text{a.s.;} \tag{8.7-57} $$

• Self-optimization occurs

$$ \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{t} [y(k) - r(k)]^2 = \sigma^2 \quad \text{a.s.} \tag{8.7-58} $$

Note that no convergence result for $\theta(t)$ is included in Theorem 3. Nonetheless,
a result similar to the one in Fact 1 can be proved. However, since the regressor
of the SG+MV control algorithm includes reference samples, here conditions under
which self-tuning occurs, viz. $\lim_{t\to\infty} \theta(t) = \nu\theta$ for some random variable $\nu$, require
that the reference trajectory be sufficiently rich in an appropriate sense [KV86].

Main points of the section CARMA plants under Minimum-Variance control
admit implicit models of linear-regression type. This fact can be exploited so as to
properly construct implicit Minimum-Variance ST control algorithms whose global
convergence can be proved via the stochastic Lyapunov function method.



8.8 Generalized Minimum-Variance Self-Tuning Control

We next focus our attention on self-tuning schemes whose underlying control
problem is the Generalized Minimum Variance (GMV) control. For a SISO CARMA
plant

$$ A(d)y(t) = d^\tau B_0(d)u(t) + C(d)e(t) \tag{8.8-1} $$

we found in Sect. 7.3 that the GMV control law equals

$$ \left[ \frac{\rho}{b_\tau}\,C(d) + Q_\tau(d)B_0(d) \right] u(t) = -G_\tau(d)y(t) + C(d)r(t+\tau) \tag{8.8-2} $$

and the $d$-characteristic polynomial of the GMV controlled system is given by

$$ \chi_{cl}(d) = \gamma \left[ \frac{\rho}{b_\tau}\,A(d) + B_0(d) \right] C(d) \tag{8.8-3} $$

Provided that $\chi_{cl}(d)$ is strictly Hurwitz, (2) minimizes in stochastic steady-state
the conditional expectation

$$ C = \mathcal{E}\left\{ [y(t+\tau) - r(t+\tau)]^2 + \rho\,u^2(t) \mid y^t, r^{t+\tau} \right\} \tag{8.8-4} $$

In order to exploit the result obtained on SG+MV ST control, we show next that
GMV control is equivalent to MV control for a suitably modified plant. To see this,
by using (7-3) rewrite the polynomial on the L.H.S. of (2) as follows

$$ \frac{\rho}{b_\tau}\,C(d) + Q_\tau(d)B_0(d) = Q_\tau(d) \left[ B_0(d) + \frac{\rho}{b_\tau}\,A(d) \right] + \frac{\rho}{b_\tau}\,d^\tau G_\tau(d) \tag{8.8-5} $$
Figure 8.8-1: The original CARMA plant controlled by the GMV controller on the
L.H.S. is equivalent to the modified CARMA plant controlled by the MV controller
on the R.H.S.



Hence, the GMV control (2) is equivalently given by

$$ Q_\tau(d) \left[ B_0(d) + \frac{\rho}{b_\tau}\,A(d) \right] u(t) = -G_\tau(d)\bar y(t) + C(d)r(t+\tau), \qquad \bar y(t) := y(t) + \frac{\rho}{b_\tau}\,u(t-\tau) \tag{8.8-6} $$

Further,

$$ A(d)\bar y(t) = A(d) \left[ y(t) + \frac{\rho}{b_\tau}\,u(t-\tau) \right] = d^\tau \left[ B_0(d) + \frac{\rho}{b_\tau}\,A(d) \right] u(t) + C(d)e(t) \tag{8.8-7} $$

By comparison with (7-47), it is immediate to recognize that (6) is the same as
the MV control law for the modified plant (7) with output $\bar y(t)$. This conclusion is
depicted in Fig. 8.8-1.
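
The polynomial identity (5) behind this equivalence can be verified numerically for any particular plant. The sketch below is only an illustration under assumed conventions: polynomials in $d$ are stored as coefficient arrays in ascending powers, $A(0) = 1$, and the helper names `pad`, `diophantine` and `gmv_identity_sides` are hypothetical.

```python
import numpy as np


def pad(p, n):
    """Zero-pad the coefficient array p (ascending powers of d) to length n."""
    out = np.zeros(n)
    out[:len(p)] = p
    return out


def diophantine(A, C, tau):
    """Solve C(d) = A(d) Q(d) + d^tau G(d) with deg Q <= tau - 1, assuming A[0] = 1."""
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    rem = pad(C, max(len(A), len(C)) + tau)
    Q = np.zeros(tau)
    for i in range(tau):                 # long division of C by A, tau steps
        Q[i] = rem[i] / A[0]
        rem[i:i + len(A)] -= Q[i] * A
    return Q, rem[tau:]                  # remainder is d^tau G(d)


def gmv_identity_sides(A, B0, C, tau, rho):
    """Return both sides of (8.8-5) as equal-length coefficient arrays."""
    A = np.asarray(A, dtype=float)
    B0 = np.asarray(B0, dtype=float)
    C = np.asarray(C, dtype=float)
    Q, G = diophantine(A, C, tau)
    k = rho / B0[0]                      # rho / b_tau
    n = max(len(A), len(B0), len(C)) + 2 * tau + len(G)
    m = max(len(A), len(B0))
    lhs = pad(k * C, n) + pad(np.convolve(Q, B0), n)
    inner = pad(B0, m) + k * pad(A, m)   # B0(d) + (rho/b_tau) A(d)
    shifted_G = np.concatenate([np.zeros(tau), G])
    rhs = pad(np.convolve(Q, inner), n) + k * pad(shifted_G, n)
    return lhs, rhs
```

For example, with $A(d) = 1 - 0.7d$, $B_0(d) = 2 + d$, $C(d) = 1 + 0.5d$, $\tau = 2$ and $\rho = 0.4$, the division gives $Q_2(d) = 1 + 1.2d$ and $G_2(d) = 0.84$, and the two sides returned by `gmv_identity_sides` coincide.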

The equivalence between GMV and MV can be used to design globally convergent
adaptive GMV controllers. If $b_\tau$ is a priori known, this can be done as follows.
Modify the plant as shown in Fig. 1. Then, use the SG+MV ST algorithm (7-55),
(7-56), for the modified plant, viz. replace in all the equations $y(t)$ with the new
output variable $\bar y(t)$. Hence, for the adaptive pure regulation problem the
conclusions of Theorems 7-1 and 7-2 are directly applicable to this new situation. Notice,
however, that the minimum-phase plant condition means here that the polynomial
$B_0(d) + \frac{\rho}{b_\tau}A(d)$ is strictly Hurwitz. For adaptive GMV tracking see Problem 2
below.

Main points of the section GMV control is equivalent to MV control of a suitably
modified CARMA plant. If the $B(d)$ leading coefficient $b_\tau$ is known, this fact
can be exploited to construct a globally convergent SG+GMV control algorithm
by the direct use of the results of Sect. 7.

Problem 8.8-1 (Global convergence of the SG-GMV controller) Express the conclusions of the 
analogue of Theorem 7-3 for the SG-GMV controller, based on the equivalence between GMV 
and MV, applied to the CARMA plant (1). 

Problem 8.8-2 (SG-GMV controller with integral action) Following the discussion throughout 
(7.5-23)-(7.5-28), construct a globally convergent SG-GMV controller for the CARIMA plant 
(7.5-26) yielding at convergence an offset-free closed-loop system and rejection of a constant 
disturbance. 

Problem 8.8-3 Suppose that $b_\tau$ is unknown and, hence, a guess $\hat b_\tau$ is used in (6) instead
of $b_\tau$. Find the closed-loop $d$-characteristic polynomial and the cost minimized in stochastic
steady-state by this new control law applied to the plant (1).

Problem 8.8-4 (MV regulation with polynomial weights) Find the MV regulation problem
equivalent to the following GMV regulation problem: minimize in stochastic steady-state the
conditional cost $C = \mathcal{E}\{ [W_y(d)y(t+\tau) + W_u(d)u(t)]^2 \mid y^t \}$, where $W_y(d)$ and $W_u(d)$ are
polynomials such that $W_y(0) = 1$ and $W_u(0) \neq 0$, for the CARMA plant (1).

8.9 Robust Self-Tuning Cheap Control 

As pointed out earlier in this chapter, the motivation behind the development
of adaptive control was the need to account for uncertainty in the structure and
parameters of the physical plants. So far we have found that for self-tuners with
underlying myopic control several reassuring results are available, provided that
the physical plant is exactly described by the adopted linear system model once its
unknown parameters are properly adjusted. These, together with similar results
for MRAC systems, were mainly obtained in the late 1970's and early 1980's. At the
beginning of the 1980's it became clear that an adaptive control system designed for
the case of an exact plant model can become unstable in the presence of unmodelled
external disturbances or small modelling errors. As a result, the issue of robustness
of adaptive systems has become of crucial interest.

In order to obtain improvements in stability, various modifications of the algo- 
rithms originally designed for the ideal case have been proposed. In this connection, 
projection of the parameter estimates onto a given fixed convex set can be adopted
to prove stability properties. This, however, requires that appropriate prior knowl- 
edge on the plant is available. Consequently, the use of projection is hampered 
whenever such a prior knowledge is unavailable. Another way to deal with mod- 
elling errors is to combine in the identifier data normalization and dead-zone. Data 
normalization is used to transform a possibly unbounded dynamics component into 
a bounded disturbance. The dead-zone facility is used to switch-off the estimate 
update whenever the prediction error is small comparatively to the expected size 
of a disturbance upperbound. 

In this section we discuss a possible approach to the robustification of STCC
based on data normalization and dead-zone. Further, data prefiltering for
identification and dynamic weights for control synthesis, being very important in
practice, are also described in some detail. STCC has been discussed under model
matching conditions in Sect. 5 and 6. Similar robustification tools will be used in
Sect. 9.1 where we shall consider indirect adaptive predictive control in the presence
of bounded disturbances and neglected dynamics.

8.9.1 Reduced Order Models

We consider a SISO plant of the form 

A°(d)y(t) = B°(d)u(t) + A°(d) [n(t) + u(t)] (8.9-la) 

where 7r(i) and uj(t) denote a predictable and, respectively, a stochastic disturbance. 
In particular, we assume that 

U(d)ir{t) = (8.9-lb) 

for a monic polynomial TL(d) with all its roots of unit multiplicity and on the unit 
circle. In order to take into account constant disturbances, we also assume that 
n(l) = 0, i.e. 

A(d) | 11(d) (8.9-lc) 
with A(d) = 1 — d. Eq. 1 can be rewritten as follows 

A (d)y(t) = B (d)u(t) + A (d)w{t) (8.9-2a) 

where 

A (d) := A°(d)U(d) B (d) := B°(d)U(d) (8.9-2b) 

We consider the situation wherein (2) is represented by a lower order model as 
follows 

A(d)y(t) = B(d)u(t) + n{t) (8.9-3a) 

where 

A (d) = A(d)A u (d) B (d) = B{d)B u {d) (8.9-3b) 

and 

, , Bid) \B U {d) - A U (d)} lfn , nnn , 

n{t) = M(d) <t] + A{ - d ^ ( 8 - 9 " 3c ) 

In (3b) $A^u(d)$ and $B^u(d)$ are monic polynomials and the superscript "u" stands 
for "unmodelled". Since $B^u(d)$ is monic, $\operatorname{ord} B(d) = \operatorname{ord} B_0(d)$. Hence, the plant 
I/O delay is retained by the modelled part of (3a). We point out that the model 
(3a)-(3c) is another way of writing (2). However, we shall use (3a) in adaptive 
control without taking into account (3c). Specifically, the idea is to identify the 
parameters of the reduced-order model (3a), viz. the polynomials $A(d)$ and $B(d)$, 
via a standard recursive identification algorithm, and for control design use only the 
estimated $A(d)$ and $B(d)$, in place of $A_0(d)$, $B_0(d)$ and possibly the properties of 
the process $\omega$. In this way, we design a reduced-complexity controller by neglecting 
the unmodelled dynamics of the plant. In (3a) the latter are accounted for by the 
term $n(t)$ as given in (3c). 
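
To see where (3c) comes from, the short derivation below substitutes the factorizations (3b) into (2a). It is only a sketch of the algebra: $1/A^u(d)$ is read here as a formal operator inverse, which is an assumption made for illustration (it corresponds to a stable filter when $A^u(d)$ is strictly Hurwitz).

```latex
\begin{aligned}
A(d)A^u(d)\,y(t) &= B(d)B^u(d)\,u(t) + A(d)A^u(d)\,\omega(t) \\
A(d)\,y(t) &= B(d)\,\frac{B^u(d)}{A^u(d)}\,u(t) + A(d)\,\omega(t) \\
           &= B(d)\,u(t)
              + \underbrace{\frac{B(d)\,[B^u(d) - A^u(d)]}{A^u(d)}\,u(t) + A(d)\,\omega(t)}_{n(t)}
\end{aligned}
```

The last line is obtained by adding and subtracting $B(d)u(t)$; the underbraced term vanishes in its first part when $B^u(d) = A^u(d)$, i.e. when nothing is left unmodelled in the input path.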



8.9.2 Prefiltering the Data 

It is crucial to realize that the factorizations (3b) are not unique. It follows 
that there are many candidate polynomial pairs $A(d)$, $B(d)$ for the reduced-order 
model (3a). To each candidate pair there corresponds an unmodelled dynamics 
disturbance $n(t)$ via (3c). Since our ultimate goal is to use (3a) for control design, 
our preference must go to those polynomial pairs making $n(t)$ small in the useful 
frequency-band. In fact, it is to be expected that the closer $A(d)$, $B(d)$ 
approximate $A_0(d)$, $B_0(d)$ inside the useful frequency-band, the better the 
reduced-complexity controller will behave. Since $A(d)$ and $B(d)$ are obtained via 
a recursive identification algorithm, e.g. RLS, whereby the output prediction error 
is minimized in a mean-square sense (Cf. (6.3-32)), an effective reduction of the 
mean-square value of $n(t)$ within the useful frequency-band can be accomplished 
via the following filtering procedure. 

The data to be sent to the identifier, $y_L(t)$ and $u_L(t)$, are obtained by prefiltering 
the plant I/O variables 

$$y_L(t) = L(d)\,y(t) \qquad u_L(t) = L(d)\,u(t) \tag{8.9-4a}$$

where $L(d)$ is a stable low-pass transfer function which rolls off at frequencies 
beyond the useful band. In such a way, the identifier fits a model to the I/O 
process $\{y_L(t), u_L(t - \tau)\}$, $\tau$ being the plant I/O delay, described by the difference 
equation 

$$A(d)\,y_L(t) = B(d)\,u_L(t) + n_L(t) \tag{8.9-4b}$$

with $n_L(t) := L(d)\,n(t)$. Note that $n_L(t)$ has most of its power within the useful 
frequency-band. Hence, the identifier, choosing $A(d)$ and $B(d)$ so as to minimize 
the overall power of $n_L(t)$, is forced by the prefiltering action of $L(d)$ to select, 
amongst the candidate polynomial pairs, one which can satisfactorily fit $A_0(d)$, 
$B_0(d)$ within the useful frequency-band. 
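
As an illustration of (4a), the sketch below prefilters I/O records with a first-order low-pass filter. The specific choice $L(d) = (1 - p)/(1 - p\,d)$, the function name `lowpass_prefilter`, the parameter `pole` and the data are assumptions made here for the example; the text only requires $L(d)$ to be stable and low-pass.

```python
def lowpass_prefilter(signal, pole=0.8):
    """Filter `signal` through L(d) = (1 - pole) / (1 - pole * d).

    Difference equation: s_L(t) = pole * s_L(t-1) + (1 - pole) * s(t),
    started from s_L(-1) = 0. The DC gain is one; the filter is stable
    for 0 <= pole < 1.
    """
    if not 0.0 <= pole < 1.0:
        raise ValueError("pole must lie in [0, 1) for a stable low-pass filter")
    filtered = []
    previous = 0.0
    for sample in signal:
        previous = pole * previous + (1.0 - pole) * sample
        filtered.append(previous)
    return filtered


# Made-up plant records, for illustration only.
y = [1.0, 1.0, 1.0, 1.0]      # constant (low-frequency) content
u = [1.0, -1.0, 1.0, -1.0]    # alternating (highest-frequency) content

# The SAME filter must be applied to both records, as in (8.9-4a).
y_L = lowpass_prefilter(y, pole=0.5)
u_L = lowpass_prefilter(u, pole=0.5)
```

The filtered records `y_L` and `u_L` would then replace `y` and `u` in the regressor fed to the identifier; the constant record passes almost unchanged while the alternating one is attenuated.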



8.9.3 Dynamic Weights 

Having the identified polynomials $A(d)$ and $B(d)$ at a given time, we could proceed 
to design the control law according to the Enforced Certainty Equivalence 
procedure by referring to the model (3a) under the assumption that $n(t) = n_L(t)/L(d)$ 
is negligible or white, viz. its power being equally distributed over all frequencies. 
However, since our strategy has been to select $A(d)$ and $B(d)$ in order to 
approximate well $A_0(d)$ and $B_0(d)$ within the useful frequency-band, we must expect that 
$n(t)$ is large at high frequencies. Then, in order to reduce the effect of the neglected 
dynamics on the controlled system, we take advantage of the considerations made 
after (7.5-29). To this end we consider the filtered variables 

$$y_H(t) := H(d)\,y(t) \qquad u_H(t) := H(d)\,u(t) \tag{8.9-5a}$$

with $H(d)$ a monic strictly-Hurwitz high-pass polynomial, and the related model 

$$A(d)\,y_H(t) = B(d)\,u_H(t) + H(d)\,n(t) \tag{8.9-5b}$$

We know that $n(t)$ is large at high frequencies. Nevertheless, for control design 
purposes we act as if $n$ were a zero-mean white noise. We then compute the MV 
control law minimizing in stochastic steady-state 

$$\mathcal{E}\left\{[y_H(t) - H(1)\,r(t)]^2 \mid y_H^{t-1}\right\} \tag{8.9-6}$$

for the plant (5) as if it were a CARMA model. Notice that this is the same as 
computing the MV control law for the dynamically weighted output $y_H(t)$ and the 
model (3a) with $n$ assumed to be white. Assuming also, for the sake of simplicity, 
the plant I/O delay $\tau$ equal to one, by Theorem 7.3-2 we find the MV control law 

$$B(d)\,u_H(t) + [H(d) - A(d)]\,y_H(t) = H(d)\,H(1)\,r(t) \tag{8.9-7}$$

Figure 8.9-1: Block scheme of a robust adaptive MV control system involving a 
low-pass filter L(d) for identification, and a high-pass filter H(d) for the control 
synthesis. 



It then follows that, for the output of the controlled system (5), (7), in stochastic 
steady-state 

$$y_H(t) = H(1)\,r(t) + n(t)$$

or 

$$y(t) = \frac{H(1)}{H(d)}\,r(t) + \frac{n(t)}{H(d)} \tag{8.9-8}$$

From (8), we see that the use of a high-pass Hurwitz polynomial H(d), such that 
1 /H(d) rolls-off at frequencies outside the desired closed-loop bandwidth, is ben- 
eficial for both shaping the reference and attenuating the effect of the neglected 
dynamics. 
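
The step from (5b) and (7) to (8) can be checked directly. The lines below are a sketch of that algebra; transients due to initial conditions are ignored, which is an assumption in line with the steady-state setting of the text.

```latex
\begin{aligned}
B(d)\,u_H(t) &= H(d)H(1)\,r(t) - [H(d) - A(d)]\,y_H(t)
  && \text{from (7)} \\
A(d)\,y_H(t) &= H(d)H(1)\,r(t) - [H(d) - A(d)]\,y_H(t) + H(d)\,n(t)
  && \text{substituted into (5b)} \\
H(d)\,y_H(t) &= H(d)\,[H(1)\,r(t) + n(t)] \\
y_H(t) &= H(1)\,r(t) + n(t),
\qquad
y(t) = \frac{H(1)}{H(d)}\,r(t) + \frac{n(t)}{H(d)}
\end{aligned}
```

The last equality uses $y_H(t) = H(d)\,y(t)$ from (5a), so both the reference and the neglected-dynamics term are filtered by $1/H(d)$.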

Fig. 1 depicts a robust adaptive MV control system designed in accordance 
with the above criteria and involving a low-pass filter L(d) for identification and a 
high-pass filter H(d) for the control synthesis. 



8.9.4 CT-NRLS with dead-zone and STCC 

The adoption of data prefiltering for identification and dynamic control weights 
as discussed so far can be very effective for counteracting a plant order 
underestimation. This situation is almost the rule in practice, the physical system 
to be controlled being usually more complex than the model adopted for control 
synthesis purposes. Under such circumstances, the two filtering actions above are, 
however, insufficient to construct globally convergent adaptive control systems. We 
see hereafter that a self-tuning cheap control system can be made globally stable 
in the presence of neglected dynamics by using an RLS identifier equipped with 
both data normalization and dead-zone. The latter facility is used to freeze the 
estimates whenever the absolute value of the prediction error becomes smaller than 
a disturbance upperbound. For the sake of simplicity, we do not explicitly use data 
prefiltering or dynamic weights in the scheme which is adopted for analysis, it being 
always possible to add suitable filtering actions so as to robustify the adaptive 
control system as indicated earlier in this section. As an additional simplification, we 
assume that the plant I/O delay $\tau$ equals one. Then, we consider the plant 

$$A(d)\,y(t) = B(d)\,u(t) + n(t) \tag{8.9-9a}$$

where 

$$A(d) := 1 + a_1 d + \cdots + a_{n_a} d^{n_a} \tag{8.9-9b}$$

$$B(d) := b_1 d + \cdots + b_{n_b} d^{n_b} \tag{8.9-9c}$$

and, similarly to (3a), $n(t)$ includes the effects of unmodelled dynamics. To assume 
from the outset that $n(t)$ is uniformly bounded would be very limitative in that 
(Cf. (3c)) $n(t)$ depends on the unmodelled dynamics and the control law. We assume 
instead that 

$$|n(t)| \le \mu\, m_0(t-1), \qquad t \in \mathbb{Z}_1 \tag{8.9-9d}$$

where the nonnegative real $m_0(t)$ is given by 

$$m_0(t) = \alpha\, m_0(t-1) + \|\varphi(t)\| \tag{8.9-9e}$$

for $\alpha \in [0, 1)$, $0 < \mu < \infty$, and 

$$\varphi(t-1) := [\,-y(t-1)\ \cdots\ -y(t-n_a)\quad u(t-1)\ \cdots\ u(t-n_b)\,]' \tag{8.9-9f}$$

The next problem shows that, if $n(t)$ is related to $u(t)$ (Cf. (3c)) by a stable transfer 
function and $\omega(t)$ is uniformly bounded, then (9) is satisfied. 

Problem 8.9-1 Consider a disturbance $n(t)$ as in (3c) with $\omega(t)$ uniformly bounded and $A^u(d)$ 
strictly Hurwitz. Show that 

$$|n(t)| \le \mu_\omega + \mu\, m_u(t-1)$$
$$m_u(t) = \alpha\, m_u(t-1) + |u(t)|$$

for $\alpha \in [0, 1)$ and $\mu_\omega$ and $\mu$ positive bounded reals. 

Setting 

$$\theta := [\,a_1\ \cdots\ a_{n_a}\quad b_1\ \cdots\ b_{n_b}\,]' \tag{8.9-10a}$$

(9a)-(9c) become 

$$y(t) = \varphi'(t-1)\,\theta + n(t) \tag{8.9-10b}$$

If as in (8.6-1) we introduce the normalization factor 

$$m(t-1) := \max\{m,\ m_0(t-1)\}, \qquad m > 0 \tag{8.9-11a}$$

and the normalized data 

$$\gamma(t) := \frac{y(t)}{m(t-1)} \qquad x(t-1) := \frac{\varphi(t-1)}{m(t-1)} \qquad \bar n(t) := \frac{n(t)}{m(t-1)} \tag{8.9-11b}$$

we get 

$$\gamma(t) = x'(t-1)\,\theta + \bar n(t) \tag{8.9-11c}$$

with 

$$|\bar n(t)| = \frac{|n(t)|}{m(t-1)} \le \frac{|n(t)|}{m_0(t-1)} \le \mu \tag{8.9-12}$$




Thus data normalization and property (9d) allow us to transform (10b) with the 
possibly unbounded sequence $\{n(t)\}$ into (11c) with the uniformly bounded 
disturbance $\bar n(t)$. 
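
The normalization in (9e), (11a) and (11b) can be sketched in a few lines. The function names, the default values of `alpha` and `m_floor` (the constant $m > 0$ of (11a)) and the sample numbers are assumptions made for this illustration only.

```python
import math


def update_m0(m0_prev, phi, alpha=0.9):
    """One step of (8.9-9e): m_0(t) = alpha * m_0(t-1) + ||phi(t)||."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    norm_phi = math.sqrt(sum(p * p for p in phi))
    return alpha * m0_prev + norm_phi


def normalize_sample(y_t, phi_prev, m0_prev, m_floor=1.0):
    """Return (gamma(t), x(t-1), m(t-1)) as in (8.9-11a)-(8.9-11b).

    y_t      : plant output y(t)
    phi_prev : regressor phi(t-1)
    m0_prev  : m_0(t-1), already updated with ||phi(t-1)||
    m_floor  : the positive constant m in (8.9-11a)
    """
    m = max(m_floor, m0_prev)
    gamma = y_t / m
    x = [p / m for p in phi_prev]
    return gamma, x, m


# Illustrative numbers: phi(t-1) = [3, 4] has norm 5.
phi = [3.0, 4.0]
m0 = update_m0(0.0, phi, alpha=0.5)
gamma, x, m = normalize_sample(10.0, phi, m0, m_floor=1.0)
```

Because $m_0(t-1)$ already contains $\|\varphi(t-1)\|$, the normalized regressor satisfies $\|x(t-1)\| \le 1$ regardless of how large the raw data grow.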

We next consider the following identification algorithm. 



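CT-NRLS with relative dead-zone (RDZ-CT-NRLS) (Cf. (6-19)) 

$$\theta(t) = \theta(t-1) + \kappa(t)\,\frac{a(t)\,P(t-1)\,x(t-1)}{1 + a(t)\,x'(t-1)P(t-1)x(t-1)}\,\bar\varepsilon(t) \tag{8.9-13a}$$

$$P(t) = \frac{1}{\lambda(t)}\left[P(t-1) - \kappa(t)\,\frac{a(t)\,P(t-1)x(t-1)x'(t-1)P(t-1)}{1 + a(t)\,x'(t-1)P(t-1)x(t-1)}\right] \tag{8.9-13b}$$

$$\lambda(t) = 1 - \frac{\kappa(t)}{\operatorname{Tr}[P(0)]}\,\frac{a(t)\,x'(t-1)P^2(t-1)x(t-1)}{1 + a(t)\,x'(t-1)P(t-1)x(t-1)} \tag{8.9-13c}$$

where $a(t)$ is as in (5-8c), 

$$\varepsilon(t) = y(t) - \varphi'(t-1)\,\theta(t-1) \tag{8.9-13d}$$

$$\bar\varepsilon(t) = \frac{\varepsilon(t)}{m(t-1)} = \gamma(t) - x'(t-1)\,\theta(t-1) \tag{8.9-13e}$$

and 

$$\kappa(t) = \begin{cases} \kappa \in \left(0, \dfrac{\varepsilon}{1+\varepsilon}\right) & \text{if } |\bar\varepsilon(t)| \ge (1+\varepsilon)^{1/2}\mu \\[1ex] 0 & \text{otherwise} \end{cases} \tag{8.9-13f}$$
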



with $\varepsilon > 0$. The algorithm is initialized from any $\theta(0)$ with $\theta_{n_a+1}(0) \ne 0$ and any 
$P(0) = P'(0) > 0$. The dead-zone facility (13f) is also called a relative dead-zone 
in that it freezes the estimate whenever the absolute value of the prediction error 
becomes smaller than a quantity depending on the norm of the regression vector, 
i.e. whenever $|\varepsilon(t)| < (1+\varepsilon)^{1/2}\mu\, m(t-1)$. 
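
One step of the recursion (13a)-(13f) can be sketched as follows. This is an illustration under stated assumptions rather than the book's implementation: the weight $a(t)$ of (5-8c) is not defined in this section and is simply passed in (default 1), and the function name, the default values of `eps` and `kappa` and the test numbers are hypothetical.

```python
import math


def rdz_ct_nrls_step(theta, P, x, gamma, mu, trace_P0,
                     eps=0.5, kappa=0.2, a=1.0):
    """One RDZ-CT-NRLS update on normalized data.

    theta, x : lists (previous estimate, normalized regressor x(t-1))
    P        : symmetric positive definite matrix as a list of rows
    gamma    : normalized output gamma(t)
    mu       : disturbance bound of (8.9-12)
    trace_P0 : Tr[P(0)], kept constant by the update
    a        : weight a(t); ASSUMED equal to 1 here
    Returns (theta_new, P_new, normalized_error).
    """
    if not 0.0 < kappa < eps / (1.0 + eps):
        raise ValueError("kappa must lie in (0, eps / (1 + eps))")
    n = len(theta)
    err = gamma - sum(xi * ti for xi, ti in zip(x, theta))      # (13e)
    if abs(err) < math.sqrt(1.0 + eps) * mu:                     # dead zone
        # kappa(t) = 0: lambda(t) = 1, so theta and P are frozen.
        return list(theta), [row[:] for row in P], err
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    xPx = sum(xi * pxi for xi, pxi in zip(x, Px))
    denom = 1.0 + a * xPx
    theta_new = [ti + kappa * a * pxi / denom * err              # (13a)
                 for ti, pxi in zip(theta, Px)]
    xP2x = sum(pxi * pxi for pxi in Px)   # x'P^2x = ||Px||^2 for symmetric P
    lam = 1.0 - (kappa / trace_P0) * a * xP2x / denom            # (13c)
    P_new = [[(P[i][j] - kappa * a * Px[i] * Px[j] / denom) / lam
              for j in range(n)] for i in range(n)]              # (13b)
    return theta_new, P_new, err


P0 = [[1.0, 0.0], [0.0, 1.0]]
x_prev = [0.6, 0.8]
# Large normalized error: the estimate is updated.
theta1, P1, e1 = rdz_ct_nrls_step([0.0, 0.0], P0, x_prev, 1.0,
                                  mu=0.1, trace_P0=2.0)
# Small normalized error: the dead zone freezes the estimate.
theta2, P2, e2 = rdz_ct_nrls_step([0.0, 0.0], P0, x_prev, 0.05,
                                  mu=0.1, trace_P0=2.0)
```

In the first call the trace of `P1` stays equal to `trace_P0`, which is the constant-trace property that (13c) is designed to enforce; in the second call nothing changes because the error lies inside the dead zone.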



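Problem 8.9-2 Consider the estimation algorithm (13). Let 

$$V(t) := \tilde\theta'(t)\,P^{-1}(t)\,\tilde\theta(t) \tag{8.9-14}$$

and $\tilde\theta(t) := \theta(t) - \theta$. Show that 

$$V(t) = \lambda(t)\left\{V(t-1) - \kappa(t)\,a(t)\left[\frac{\bar\varepsilon^2(t)}{1 + Q(t)} - \frac{\bar n^2(t)}{1 + [1 - \kappa(t)]\,Q(t)}\right]\right\} \tag{8.9-15a}$$

with 

$$Q(t) := a(t)\,x'(t-1)\,P(t-1)\,x(t-1) \tag{8.9-15b}$$

[Hint: Use (5.3-16) to show that 

$$P^{-1}(t) = \lambda(t)\left[P^{-1}(t-1) + \kappa(t)\,a(t)\,\frac{x(t-1)\,x'(t-1)}{1 + [1 - \kappa(t)]\,Q(t)}\right] \tag{8.9-16}$$ ] 
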
It is easy to check that (15a) can be rewritten as follows 

$$V(t) = \lambda(t)\left\{V(t-1) - \kappa(t)\,a(t)\left[\frac{\varepsilon}{1+\varepsilon}\,\frac{1 + \left[1 - \frac{1+\varepsilon}{\varepsilon}\,\kappa(t)\right]Q(t)}{[1 + Q(t)]\,\{1 + [1 - \kappa(t)]\,Q(t)\}}\,\bar\varepsilon^2(t) + \frac{\bar\varepsilon^2(t) - (1+\varepsilon)\,\bar n^2(t)}{(1+\varepsilon)\,\{1 + [1 - \kappa(t)]\,Q(t)\}}\right]\right\} \tag{8.9-17}$$

This equation shows that, for every $\varepsilon > 0$, the dead-zone facility (13f) insures that 
$\{V(t)\}_{t=0}^{\infty}$ is monotonically nonincreasing 

$$V(t) \le \lambda(t)\,V(t-1) \le V(t-1) \tag{8.9-18}$$

where the latter inequality follows from (13c). Hence, $V(t)$ converges as $t \to \infty$. 
We can then establish the following result. 

Proposition 8.9-1 (RDZ-CT-NRLS). Consider the RDZ-CT-NRLS algorithm 
(13) with the data generating mechanism (10)-(12). Then, the following properties 
hold: 

i. Uniform boundedness of the estimates 

ll^)H 2 < A ^™)] l|g " (0)l12 (8 - 9 " 19) 

ii. Finite-time convergence inside the dead-zone, viz. there is a finite integer $T_1$ 
such that for all $t > T_1$ 

$$\kappa(t) = 0 \tag{8.9-20}$$

or 

$$|\bar\varepsilon(t)| < (1+\varepsilon)^{1/2}\mu \tag{8.9-21}$$

iii. Estimate convergence in finite time 

$$\theta(t) = \theta_\infty := \lim_{k \to \infty} \theta(k), \qquad \forall t > T_1 \tag{8.9-22}$$


Proof 

i. Eq. (19) follows by the same argument used to get (4-7). 

ii. Let 

$$\mathcal{T} := \{\, t \in \mathbb{Z}_+ \mid \kappa(t) > 0 \,\}$$

It will be proved by contradiction that $\mathcal{T}$ has a finite number of elements. Suppose that 
this is untrue. Then, there is a sequence $\{t_k\}_{k=1}^{\infty}$ in $\mathcal{T}$ with $\lim_{k\to\infty} t_k = \infty$. Then, from 
(17), (18) and the convergence of $V(t)$ it follows that $\lim_{k\to\infty} \bar\varepsilon^2(t_k) = 0$. This in turn 
implies that there is a $t_i \in \mathcal{T}$ such that for all $t_k \in \mathcal{T}$, $t_k > t_i$, 
$|\bar\varepsilon(t_k)| < (1+\varepsilon)^{1/2}\mu$. Consequently $\kappa(t_k) = 0$ or $t_k \notin \mathcal{T}$: a 
contradiction. 

iii. Eq. (22) follows trivially from ii. 



STCC Given the estimate $\theta(t)$ of $\theta$ obtained by the dead-zone CT-NRLS 
algorithm, the control variable $u(t)$ is selected by solving w.r.t. $u(t)$ the equation 

$$\varphi'(t)\,\theta(t) = r(t+1) \tag{8.9-23}$$

We shall refer to the algorithm (13), (23), when applied to (9), as the reduced-order 
STCC or the RDZ-CT-NRLS + CC algorithm. By the choice (23), (21) yields 

$$\frac{|y(t) - r(t)|}{m(t-1)} < (1+\varepsilon)^{1/2}\mu, \qquad \forall t > T_1$$

If $m(t-1)$, $t \in \mathbb{Z}_1$, is uniformly bounded, the last inequality indicates that the tracking 
error is upperbounded by $(1+\varepsilon)^{1/2}$ times the disturbance upperbound $\mu\, m(t-1)$ 
for $n(t)$ as given by (12). 

In order to prove the crucial issue of boundedness, we resort to a variant of the 
Key Technical Lemma based on (21). We first show that the linear boundedness 
condition (3-10) holds under the following assumption. For the plant (1), among 
the reduced-order models (9) with all the stated properties there is one such that 

$$\frac{B(d)}{d\,A(d)} \ \text{ is stably invertible} \tag{8.9-24}$$


As indicated after (5-15), this condition, and the fact that by (23) $\varepsilon(t) = y(t) - r(t)$, 
implies that 

$$\|\varphi(t-1)\| \le c_1 + c_2 \max_{k \in [1,t]} |\varepsilon(k)| \tag{8.9-25}$$

for nonnegative reals $c_1$ and $c_2$. Then, 

$$m_0(t-1) \le m_0(-1) + (1-\alpha)^{-1} \max_{k \in [0,t]} \|\varphi(k-1)\| \qquad [(9e)]$$

$$\phantom{m_0(t-1)} \le c_3 + c_4 \max_{k \in [1,t]} |\varepsilon(k)| \tag{8.9-26}$$

We are now ready to establish global convergence of the adaptive system in the 
presence of neglected dynamics. 

Theorem 8.9-1. (Global convergence of the RDZ-CT-NRLS + CC algorithm) 
Let the "minimum-phase" condition (24) be satisfied and the output 
reference $r(t)$ be uniformly bounded. Suppose that for some nonnegative real $c_4$ as 
in (26) 

$$(1+\varepsilon)^{1/2}\mu < \frac{1}{c_4} \tag{8.9-27}$$

Then, the reduced-order self-tuning control algorithm (13) and (23), when applied 
to the plant (1), for any possible initial condition, yields that: 

i. $y(t)$ and $u(t)$ are uniformly bounded; (8.9-28) 

ii. There exists a finite $T_1$ after which the controller parameters self-tune in such 
a way that 

$$|y(t) - r(t)| < (1+\varepsilon)^{1/2}\mu\, m(t-1), \qquad \forall t > T_1 \tag{8.9-29}$$

Proof We use (21), and (25)-(27) to prove i. If $\varepsilon(t)$ is uniformly bounded, uniform boundedness of 
$\|\varphi(t-1)\|$ follows at once from (25). Assume that $\{\varepsilon(t)\}_{t=1}^{\infty}$ is unbounded. Then, arguing as in the 
second part of the proof of Lemma 3-1, along a subsequence $\{t_n\}$ such that $\lim_{n\to\infty} |\varepsilon(t_n)| = \infty$ 
and $|\varepsilon(t)| \le |\varepsilon(t_n)|$ for $t \le t_n$, we have 

$$\kappa(t_n) = 0, \qquad \forall t_n > T_1$$

or 

$$\frac{\varepsilon^2(t_n)}{m^2(t_n - 1)} < (1+\varepsilon)\,\mu^2, \qquad \forall t_n > T_1 \tag{8.9-30}$$

On the other hand 

$$\frac{|\varepsilon(t_n)|}{m(t_n - 1)} \ge \frac{|\varepsilon(t_n)|}{m + m_0(t_n - 1)} \qquad [(11a)]$$

$$\phantom{\frac{|\varepsilon(t_n)|}{m(t_n - 1)}} \ge \frac{|\varepsilon(t_n)|}{m + c_3 + c_4\,|\varepsilon(t_n)|} \qquad [(26)]$$

which implies that 

$$\lim_{t_n \to \infty} \frac{|\varepsilon(t_n)|}{m(t_n - 1)} \ge \frac{1}{c_4} \tag{8.9-31}$$

This inequality contradicts (30) whenever (27) is satisfied. 



It must be pointed out that (27), in order to be satisfied for a large $\mu$, entails 
that (26) holds for a small $c_4$. Tracing back the meaning of $c_4$ (Cf. (5-15)), we see 
that, in this respect, the more stably invertible the transfer function in (24) is, 
the smaller $c_4$ turns out to be. On the other hand, given the transfer function in 
(24), the adaptive system stays stable provided that the disturbance $n(t)$ is linearly 
bounded as in (9d) and (9e) and $\mu$ is small enough to satisfy the complementary 
condition (27). Being proportional to $\mu$, the relative dead-zone must also be made 
small enough in agreement with the latter remark. To sum up, the practical relevance 
of Theorem 1 is that it suggests that the use of a relative dead-zone, small enough 
to satisfy (27), can make the RDZ-CT-NRLS + CC self-tuning controller robust 
against the class of neglected dynamics for which the upperbound (9) holds. 

Main points of the section Low-pass prefiltering of data is instrumental for 
forcing the identifier to choose, among all possible reduced-order models of the 
plant, the ones for which the unmodelled dynamics have reduced effects inside the 
useful frequency-band. Further, high-pass dynamic weighting in control design 
turns out to be beneficial for both reference shaping and attenuating the response 
to the high-frequency disturbances due to the neglected dynamics. 

Under some conditions for the unmodelled dynamics, self-tuning cheap control 
systems can be made robustly globally convergent by equipping the RLS identifier 
with both data normalization and a relative dead zone facility, the latter to freeze 
the estimates whenever the absolute value of the prediction error becomes smaller 
than a disturbance upperbound. The latter is however required to be small enough 
so as not to destabilize the controlled system. 

Problem 8.9-3 (STCC with bounded disturbance) Consider the plant (9) with (9d) replaced by 

$$|n(t)| \le N < \infty$$

Next, redefine the normalization factor as in (6-1a), and modify in the CT-NRLS identifier (13) 
the dead-zone mechanism as follows 

$$\kappa(t) = \begin{cases} \kappa \in \left(0, \dfrac{\varepsilon}{1+\varepsilon}\right) & \text{if } |\varepsilon(t)| \ge (1+\varepsilon)^{1/2} N \\[1ex] 0 & \text{otherwise} \end{cases}$$

Show that, with the above modifications for the STCC algorithm (13) and (23) and the plant (1), 
(28) and 

$$|y(t) - r(t)| < (1+\varepsilon)^{1/2} N$$

both hold, under all the validity conditions of Theorem 1 with no need of (27). 

Thorough presentations and studies of MRAC systems are available in the 
textbooks [AW89], [NA89], [SB89], and [But92]. See also [Lan79] and 
[Cha87]. The early MRAC approach, based on the gradient method, dates back 
to about 1958. The difficulties with stability of the gradient method were analyzed 
in [Par66]. The innovative approach of [Mon74], whereby the feedback gains are 
directly estimated and the use of pure differentiators avoided, was thereafter adopted 
to produce technically sound MRAC systems for continuous-time minimum-phase 
plants. However, only in 1978 was global stability of the above mentioned MRAC 
systems, subject to additional assumptions on the available prior information, 
established by [FM78], [NV78], [Mor80] and [Ega79]. The last reference also gives 
a unification of MRAC and STC systems. For Bayesian stochastic adaptive control 
see [LDU72], [DUL73] and, for the continuous-time case, [Hij86]. 

At different levels of mathematical detail, self-tuning control systems are 
presented in the textbooks [GS84], [DV85], [Che85], [KV86], [AW89], [CG91], [WZ91] 
and [ILM92]. For a monograph on continuous-time self-tuning control see [Gaw87]. 
One of the earlier works on the self-tuning idea is [Kal58] whereby least-squares 
estimation with deadbeat control was proposed. Two similar schemes, [Pet70] and 
[WW71], combining RLS estimation and MV regulation were presented at an IFAC 
symposium in 1970. The first thorough analysis of the RLS+MV self-tuning 
regulator was reported at the 5th IFAC World Congress in 1972 [AW73], showing that 
in this scheme w.s. self-tuning and weak self-optimization occur. GMV self-tuning 
control was presented in [CG75]. However, it was not until 1980 that global 
convergence of the STCC, as in Theorem 5-1, was reported by [GRC80]. The global 
convergence proof of Theorem 6-2 of the implicit STCC algorithm (19), (20), based 
on the constant-trace RLS identifier with data normalization of [LG85], is reported 
here for the first time. For global convergence of other STCC algorithms based 
on finite-memory identifiers see also [BBC90b]. In a stochastic setting, the global 
convergence proof of the SG+MV algorithm, as in Theorem 7-1, was first given 
in [GRC81] where MIMO plants with an I/O delay $\tau > 1$ are also considered. 
[GRC81] can be considered the first rigorous stochastic adaptive control result 
in a self-tuning framework. The extension of Theorem 7-1 as in Theorem 7-2 is 
reported in [KV86]. In [KV86] the geometric properties of the SG estimation 
algorithm are used to establish the self-tuning property of the SG+MV regulator in 
Fact 7-1. For self-tuning trackers see also [KP87]. A global convergence analysis 
of an indirect stochastic self-tuning MV control based on an RELS(PO) estimation 
algorithm is given in [GC91]. 

In the late 1970s and early 1980s it became clear that violation of the exact 
modelling conditions can cause adaptive control algorithms to become unstable. 
This phenomenon was pointed out, among others, by [Ega79], [RVAS81], 
[RVAS82], [IK84], and [RVAS85]. To counteract instability and improve robustness 
w.r.t. bounded disturbances and unmodelled dynamics, various modifications to 
the basic algorithms have been proposed. Some overview of these techniques can 
be found in [ABJ+86], [Ast87], [OY87], [MGHM88], [IS88] and [ID91]. The major 
modifications consist of data normalizations with parameter projection [Pra83], 
[Pra84], σ-modifications [IK83], [IT86a], and relative dead-zones with parameter 
projection [KA86]. Another technique to enhance robustness is to inject a perturbation 
signal into the plant so as to make the regression vector persistently exciting [LN88], 
[NA89], [SB89]. For robust STCC our analysis in Sect. 9 shows that the choice can 
be limited to data normalization and relative dead-zone. Most of the above 
robustification techniques and studies do not use stochastic models for disturbances. 
These are in fact merely assumed to be bounded or to originate from plant 
undermodelling. For an exception to this, see [CG88], [PLK89], [Yds91] and the stochastic 
analysis of a STCC robustified via parameter projection onto a compact convex set 
reported in [RM92]. 

For data prefiltering in identification for robust control design see [RPG92] and 
[SMS92]. 

CHAPTER 9 



ADAPTIVE PREDICTIVE 
CONTROL 



In this chapter we shall study how to combine identification methods and multistep 
predictive control to develop adaptive predictive controllers with nice properties. 
The main motivation for using underlying multistep predictive control laws in 
self-tuning control is to extend the field of possible applications beyond the restrictions 
pertaining to single-step-ahead controllers. In Sect. 1 we first study how to 
construct a globally convergent adaptive predictive control system under ideal model 
matching conditions. To this end, the use of a self-excitation mechanism, though 
of a vanishingly small intensity, turns out to be essential to guarantee that the 
controller self-tunes on a stabilizing control law. We next study how to robustify 
the controller for the bounded disturbances and neglected dynamics case. In this 
case, along with a self-excitation of high intensity, low-pass filtering, normalization 
of the data entering the identifier, as well as the use of a dead-zone, become 
of fundamental importance. 

From Sect. 2 to Sect. 6 we deal with implicit adaptive predictive control. In 
Sect. 2 we show how the implicit description of CARMA plants in terms of 
linear-regression models, which is known to hold under Minimum-Variance control, can 
be extended to more complex control laws, such as those of predictive type. In Sect. 3 
and 4 this property is exploited so as to construct implicit adaptive predictive 
controllers. In Sect. 5 one such controller, MUSMAR, which possesses attractive 
local self-optimizing properties, is studied via the ODE (Ordinary Differential 
Equation) approach to analysing recursive stochastic algorithms. Finally, Sect. 6 
deals with two extensions of the MUSMAR algorithm: the first imposes a 
mean-square input constraint on the controlled system; the second is aimed at exactly 
recovering the steady-state LQ stochastic regulation law as an equilibrium point of 
the algorithm. 

9.1 Indirect Adaptive Predictive Control 

9.1.1 The Ideal Case 

Consider the SISO plant 

$$A(d)\,y(t) = B(d)\,u(t) + A(d)\,c(t) \tag{9.1-1a}$$

where: $y(t)$ and $u(t)$ are the output and, respectively, the manipulable input; $c(t) \equiv c$ 
is an unknown constant disturbance; and 

$$A(d) = 1 + a_1 d + \cdots + a_{n_a} d^{n_a}, \qquad a_{n_a} \ne 0$$

Note that here the leading coefficients of $B(d)$ are allowed to be zero, and hence 
an unknown plant I/O delay $\tau$, $1 \le \tau \le n_b$, can be present. The plant can be also 
represented in terms of the input increments $\delta u(t) := u(t) - u(t-1)$ by the model 

$$A(d)\,\Delta(d)\,y(t) = B(d)\,\delta u(t) \tag{9.1-1c}$$

with $\Delta(d) = 1 - d$. The main goal is to develop a globally convergent adaptive 
controller based on SIORHC, the predictive control law of Sect. 5.8 which, for 
convenience, is restated hereafter. 
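Given the state 

$$s(t) := \left[\, \left(y^{t}_{t-n_a}\right)' \quad \left(\delta u^{t-1}_{t-n+1}\right)' \,\right]' \tag{9.1-2}$$

find, whenever it exists, the input increment sequence $\delta\hat u_{[t,t+T)}$ to the plant (1c) 
minimizing 

$$J\left(s(t), \delta u_{[t,t+T)}\right) = \sum_{k=t}^{t+T-1} \left[\varepsilon_y^2(k+1) + \rho\,\delta u^2(k)\right], \qquad \rho > 0 \tag{9.1-3}$$

under the constraints 

$$\delta u^{t+T+n-2}_{t+T} = O_{n-1} \qquad y^{t+T+n-1}_{t+T} = \underline{r}(t+T) \tag{9.1-4}$$

In the above equations: $\varepsilon_y(k) := y(k) - r(k)$, $r(k)$ being the output reference to be 
tracked; and $\underline{r}(k) := [\, r(k)\ \cdots\ r(k) \,]' \in \mathbb{R}^n$. Then, the plant input increment 
$\delta u(t)$ at time $t$ is set equal to $\delta\hat u(t)$ 

$$\delta u(t) = \delta\hat u(t) \tag{9.1-5}$$

The SIORHC law is given by (5.8-45) 

$$\delta u(t) = -e_1' M^{-1} \left\{ \left(I_T - Q L M^{-1}\right) W_1' \left[\Gamma_1 s(t) - r^{t+T}_{t+1}\right] + Q \left[\Gamma_2 s(t) - \underline{r}(t+T)\right] \right\} \tag{9.1-6}$$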

with the properties stated in Theorem 5.8-2. In particular, in order to insure that 
(6) is well-defined and that it stabilizes the plant (1) we shall adopt the following 
assumptions. 

Assumption 9.1-1 

• $A(d)\Delta(d)$ and $B(d)$ are coprime polynomials. (9.1-7) 

• $n := \max\{n_a + 1, n_b\}$ is a priori known. (9.1-8) 

• $T \ge n$. (9.1-9) 

□ 
It is instructive to compare the above assumptions with Assumption 8.5-1 adopted 
for the implicit STCC (8.5-8), (8.5-9). First, here there is no need of assuming 
that the plant is minimum-phase and that its I/O delay is known. As repeatedly 
remarked, these are limitative assumptions which strongly reduce the range of 
possible applications. On the other hand, here the plant order $n$, as opposed to 
an upperbound on it, is supposed to be a priori known. This is a quite limitative 
assumption as well. At the end of this section, however, we shall study how to 
modify the adaptive algorithm so as to deal with the common practical case of 
a plant whose true order exceeds the presumed plant order. It is, nonetheless, 
convenient to begin with studying adaptive predictive control in the present ideal 
setting. 



I. Estimation algorithm 



In order to possibly deal with slowly time-varying plants, an estimation algorithm 
with finite data-memory length is considered. In particular, the CT-NRLS of 
Sect. 8.6 is chosen because of its nice properties established in Theorem 8.6-1. The 
aim is to recursively estimate the plant polynomials $A(d)$ and $B(d)$ and use these 
estimates to compute at each time-step the input increment $\delta u(t)$ in (6). In order 
to suppress the effect of the constant disturbance $c$ on the estimates, the plant 
incremental model is considered for parameter estimation 

$$A(d)\,\delta y(t) = B(d)\,\delta u(t), \qquad t \in \mathbb{Z}_1 \tag{9.1-10a}$$

with $\delta y(t) := y(t) - y(t-1)$. Defining 

$$\varphi(t-1) := [\,-\delta y(t-1)\ \cdots\ -\delta y(t-n_a)\quad \delta u(t-1)\ \cdots\ \delta u(t-n_a-1)\,]' \in \mathbb{R}^{n_\theta} \tag{9.1-10b}$$

$$\theta^* := [\,a_1\ \cdots\ a_{n_a}\quad b_1\ \cdots\ b_{n_a+1}\,]' \tag{9.1-10c}$$

$$m(t-1) := \max\{m,\ \|\varphi(t-1)\|\}, \qquad m > 0 \tag{9.1-11a}$$

and the normalized data 

$$\delta\gamma(t) := \frac{\delta y(t)}{m(t-1)} \qquad x(t-1) := \frac{\varphi(t-1)}{m(t-1)} \tag{9.1-11b}$$

we have 

$$\delta y(t) = \varphi'(t-1)\,\theta^* \qquad \delta\gamma(t) = x'(t-1)\,\theta^* \tag{9.1-12}$$


Note that, in contrast with $\theta$ which was used in Chapter 8 to denote any vector 
in the parameter variety $\Theta \subset \mathbb{R}^{n_\theta}$, here the symbol $\theta^*$ denotes the unique vector 
in $\mathbb{R}^{n_\theta}$ satisfying (12), with $n_\theta = 2n - 1$. The following CT-NRLS algorithm 
(Cf. Sect. 8.6) is used to estimate $\theta^*$: 

$$\theta(t) = \theta(t-1) + \frac{P(t-1)\,x(t-1)}{1 + x'(t-1)P(t-1)x(t-1)}\,\varepsilon(t) = \theta(t-1) + \lambda(t)\,P(t)\,x(t-1)\,\varepsilon(t) \tag{9.1-13a}$$

$$\varepsilon(t) = \delta\gamma(t) - x'(t-1)\,\theta(t-1) \tag{9.1-13b}$$

$$P(t) = \frac{1}{\lambda(t)}\left[P(t-1) - \frac{P(t-1)\,x(t-1)\,x'(t-1)\,P(t-1)}{1 + x'(t-1)P(t-1)x(t-1)}\right] \tag{9.1-13c}$$

$$\lambda(t) = 1 - \frac{1}{\operatorname{Tr}[P(0)]}\,\frac{x'(t-1)\,P^2(t-1)\,x(t-1)}{1 + x'(t-1)P(t-1)x(t-1)} \tag{9.1-13d}$$


The algorithm is initialized from any $P(0) = P'(0) > 0$ and any $\theta(0) \in \mathbb{R}^{n_\theta}$ such 
that $A_0(d)\Delta(d)$ and $B_0(d)$ are coprime, $A_0(d)$ and $B_0(d)$ being the polynomials in 
the plant model (10a) corresponding to the parameter vector $\theta(0)$. The algorithm 
(13) still enjoys the same properties as the standard CT-NRLS in Theorem 8.6-1 
(Cf. Problem 8.6-5). For convenience, we list the ones which will be used hereafter. 

• $\|\theta(t)\| < M_\theta < \infty$, $\forall t \in \mathbb{Z}_+$ (9.1-14) 

• $\lim_{t\to\infty} \theta(t) = \theta(\infty)$ (9.1-15) 

• $\lim_{t\to\infty} \|\theta(t) - \theta(t-k)\| = 0$, $\forall k \in \mathbb{Z}_+$ (9.1-16) 

• $\lim_{t\to\infty} \delta(t) = 0 \implies \theta(\infty) = \theta^*$, with $\delta(t) := \prod_{j=1}^{t} \lambda(j)$ (9.1-17) 

• $\lim_{t\to\infty} \varepsilon(t) = 0$ (9.1-18) 

According to the standard indirect self-tuning procedure based on the Enforced 
Certainty Equivalence, the desired adaptive control algorithm should be completed 
as follows. After estimating 

$$\theta(t) = [\,a_1(t)\ \cdots\ a_{n_a}(t)\quad b_1(t)\ \cdots\ b_{n_a+1}(t)\,]' \tag{9.1-19a}$$

construct the polynomials $A_t(d)\Delta(d)$ and $B_t(d)$ where 

$$A_t(d) := 1 + a_1(t)\,d + \cdots + a_{n_a}(t)\,d^{n_a} \qquad B_t(d) := b_1(t)\,d + \cdots + b_{n_a+1}(t)\,d^{n_a+1} \tag{9.1-19b}$$

Next, from $A_t(d)\Delta(d)$ and $B_t(d)$ compute the matrices $W(t)$ and $\Gamma(t)$ as in (5.5-9) 
and (5.5-10), respectively. Partition $W(t)$ and $\Gamma(t)$ as indicated in (5.5-11) and 
(5.5-12) to obtain $W_1(t)$, $W_2(t)$, $\Gamma_1(t)$, $\Gamma_2(t)$, $M(t)$ and $Q(t)$. Finally, determine 
the next input increment $\delta u(t)$ by using (6) once $W_1$, $W_2$, $L$, $\Gamma_1$, $\Gamma_2$, $M$ and $Q$ are 
replaced by $W_1(t)$, $W_2(t)$, $L(t)$, $\Gamma_1(t)$, $\Gamma_2(t)$, $M(t)$ and $Q(t)$, respectively. 

This route, however, cannot be safely adopted without modifications. One reason 
is that the estimation algorithm (13) does not insure that 
$A_t(d)\Delta(d)$ and $B_t(d)$ be coprime at every time $t$ and, hence, boundedness of the 
matrix $Q(t)$. In fact, if $A_t(d)\Delta(d)$ and $B_t(d)$ are not coprime, $L(t)$ does not have 
full row-rank and, in turn, $Q(t) = L'(t)\,[L(t)M^{-1}(t)L'(t)]^{-1}$ does not exist. Even 
if the control law is modified so as to insure boundedness of $Q(t)$ when $A_t(d)\Delta(d)$ 
and $B_t(d)$ become noncoprime, boundedness of the controller parameters must be 
also guaranteed asymptotically. One approach that has been often suggested to this 
end is to constrain $\theta(t)$ to belong to a convex admissible subset of $\mathbb{R}^{n_\theta}$ 
containing $\theta^*$ and whose elements give rise to coprime polynomials $A_t(d)\Delta(d)$ and $B_t(d)$. 
This can be achieved by suitably projecting $\theta(t)$ in (13) onto the above admissible 
subset. Since in most practical cases the choice of such a subset appears artificial, 
we shall follow a different approach. It consists of injecting into the plant, along 
with the control variable, an additive dither input whenever the estimate $\theta(t)$ turns 
out to yield a control law close to singularity. In the ideal case, such an action, 
if well-designed, turns out to be very effective. In fact the dither input, despite 
the fact that it turns off forever after a finite time, drives the estimate $\theta(t)$ to converge 
far from any singular vector. Such a mode of operation will be referred to as a 
self-excitation mechanism since the dither is switched on only when the controller 
judges its current state close to singularity. The resulting control philosophy is 
then of dual control type (Cf. Sect. 8.2). 

We next proceed to construct a self-excitation mechanism suitable for the 
adopted underlying control law. To this end we have to define a syndrome 
according to which the controller detects its pathological states. Consider assumption 
(7). It implies that 

E(L) := (n) " W(EW = e (0 ' 1] (9 ' 1_20a) 

if $L$ is the matrix in the SIORHC law (6) corresponding to the true plant parameter 
vector $\theta^*$. Let now $\varsigma$ be any positive real such that 

$$0 < \varsigma < \varsigma^* \tag{9.1-20b}$$

In practice, if no prior information is given on the size of $\varsigma^*$, $\varsigma$ can be chosen to 
equal any positive number arbitrarily close to, but greater than, the zero of the digital 
processor implementing the adaptive controller. Given an estimate $\theta(t)$ of $\theta^*$, the 
related syndrome can be defined to be $\Xi(\theta(t)) := \Xi(L(t))$, if $L(t)$ denotes the matrix 
in (6) when the SIORHC law is computed from $\theta(t)$. The estimate $\theta(t)$ is judged 
to be pathologic whenever $\Xi(\theta(t)) < \varsigma$. In this way the set of admissible $\theta(t)$, viz. 
the ones for which $\Xi(\theta(t)) \ge \varsigma$, includes $\theta^*$ and all estimates yielding a bounded 
SIORHC law. We are now ready to proceed to construct the remaining part of the 
adaptive controller. 

II. Controller with self excitation 

The control law is retuned every $N$ sampling steps, with 

$$N \ge 4n - 1 \qquad \text{(9.1-21)}$$

even if the plant parameter estimate $\theta(t)$ is updated at every sampling time. The 
main reason for doing this is to keep the analysis of the adaptive system as simple 
as possible. For the sake of simplicity, we shall indicate that a matrix $M$ depends 
on the estimate $\theta(t)$ by using the notation $M(t)$ in place of $M(\theta(t))$. 
If $t = (k-1)N + 1$, $k \in \mathbb{Z}_1$: 

i. Form $A_t(d)\Delta(d)$ and $B_t(d)$ by using the current estimate $\theta(t)$; 

ii. Compute the matrices $W(t)$ and $\Gamma(t)$ via (5.5-9) and (5.5-10); 

iii. Partition $W(t)$ and $\Gamma(t)$ so as to obtain $W_1(t)$, $L(t)$, $W_2(t)$, $\Gamma_1(t)$, $\Gamma_2(t)$, 
$M(t)$, and $Q(t)$; 



iv. If 

$$\Xi(\theta(t)) \ge \varsigma \qquad \text{(9.1-22)}$$

compute the plant input increments $\delta u(\tau)$, $t \le \tau < t + N$, by using (6) with 
$W_1$, $W_2$, $L$, $\Gamma_1$, $\Gamma_2$, $M$ and $Q$ replaced by $W_1(t)$, $W_2(t)$, $L(t)$, $\Gamma_1(t)$, $\Gamma_2(t)$, 
$M(t)$ and $Q(t)$, respectively. If 

$$\Xi(\theta(t)) < \varsigma \qquad \text{(9.1-23)}$$

self-excitation is turned on, viz. set 

$$\delta u(\tau) = \delta u^{\circ}(\tau) + \eta(\tau), \qquad t \le \tau < t + N \qquad \text{(9.1-24)}$$

$$\eta(\tau) = \nu(k)\,\delta_{\tau,\,kN-2n+1} \qquad \text{(9.1-25)}$$

where: $\delta u^{\circ}(\tau)$ is either given by (6) computing $Q(t)$ via any pseudoinverse 
(Cf. (5.5-2)), or, if $L(t) = 0$, by the same control law over the previous 
time interval $((k-2)N, (k-1)N]$; $\eta(\tau)$ is the dither component due to self-excitation, 
with $\delta_{\tau,i}$ the Kronecker symbol and $\nu(k)$ a scalar specified by the 
next Lemma 1. 

Figure 9.1-1: Illustration of the mode of operation of adaptive SIORHC (time axis marked 
at $(k-1)N+1$, $kN-2n+1$, $kN$ and $kN+1$, with labels for the evaluation instants and for the 
injection of self-excitation if needed). 

The algorithm (13), (21)-(25), whose mode of operation is depicted in Fig. 1, will 
be referred to as adaptive SIORHC. It generates plant input increments as follows 

$$R_t(d)\,\delta u(t) = -S_t(d)\,y(t) + v(t) + \eta(t)$$
$$v(t) := Z_t(d)\,r(t + T) \qquad \text{(9.1-26)}$$

where $R_t(d)$, $S_t(d)$, $Z_t(d)$, with $R_t(0) = 1$ and $\partial Z_t(d) = T - 1$, are the polynomials 
corresponding to the SIORHC law (6) at time $t$. 
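
The timing of the dither in (24)-(25) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `dither` and `input_increments` and the boolean flag `pathological` (standing for the outcome of the syndrome test (23)) are hypothetical, and the nominal increments are taken as given rather than computed from the SIORHC law (6).

```python
def dither(tau: int, k: int, N: int, n: int, nu: float) -> float:
    """Dither eta(tau) of (25): nu at tau = k*N - 2*n + 1, zero elsewhere."""
    return nu if tau == k * N - 2 * n + 1 else 0.0


def input_increments(nominal, k, N, n, nu, pathological):
    """Increments over tau = (k-1)N+1, ..., kN, following (22)-(24).

    nominal      -- the N increments delivered by the underlying control law
    pathological -- True when the syndrome test (23) holds at t = (k-1)N+1
                    (hypothetical flag; the syndrome itself is not computed here)
    """
    if N < 4 * n - 1:
        raise ValueError("retuning period must satisfy N >= 4n - 1, cf. (21)")
    if len(nominal) != N:
        raise ValueError("expected one nominal increment per step of the interval")
    if not pathological:
        # Case (22): the nominal law is applied unchanged.
        return list(nominal)
    t0 = (k - 1) * N + 1  # first time index of the k-th retuning interval
    # Case (23): a single dither sample is superimposed at tau = kN - 2n + 1.
    return [du + dither(t0 + j, k, N, n, nu) for j, du in enumerate(nominal)]
```

With $N = 7$, $n = 2$ and $k = 1$ the dither falls on $\tau = 4$, i.e. on the fourth sample of the interval, and all other increments are left untouched.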

Remark 9.1-1 In the algorithm (13), (21)-(25), the plant parameter-vector 
estimate is updated every time-step, whereas the controller parameters are kept 
constant for $N$ time-steps. This mode of operation is only adopted for keeping 
the analysis of the algorithm as simple as possible. However, we point out that 
adaptive SIORHC can operate in a more efficient way, with no change in the 
conclusions of the subsequent convergence analysis, if the controller parameters 
are updated at each time-step, except at the times where (23) holds. At such times 
the controller parameters are to be computed according to (24) and (25), and frozen 
for the subsequent $N \ge 4n - 1$ time steps. □ 

In order to establish which value to assign to $\nu(k)$, it is convenient to introduce the 
following square Toeplitz matrix of dimension $2n$ whose entries are the feedforward 
input increments defined in (26) 

$$V(k) := \begin{bmatrix} v(kN-2n+1) & v(kN-2n) & \cdots & v(kN-4n+2) \\ v(kN-2n+2) & v(kN-2n+1) & \cdots & v(kN-4n+3) \\ \vdots & \vdots & & \vdots \\ v(kN) & v(kN-1) & \cdots & v(kN-2n+1) \end{bmatrix} \qquad \text{(9.1-27)}$$

Lemma 9.1-1. Assume that $\nu(k)$ in (25) is not an eigenvalue of $V(k)$ 

$$\nu(k) \notin \mathrm{sp}[V(k)] \qquad \text{(9.1-28)}$$

Then if (28) holds for infinitely many $t = kN + 1$, it follows that the estimate $\theta(t)$ 
converges to the true plant parameter vector $\theta^*$ 

$$\lim_{t \to \infty} \theta(t) = \theta^* \qquad \text{(9.1-29)}$$



Remark 9.1-2 Eq. (28) indicates that the self-excitation signal must be chosen by 
taking into account the feedforward signal. The reason is that we have to consider 
the interaction between self-excitation and feedforward so as to avoid that the 
latter annihilates the effects of the former. □ 

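Proof of Lemma 1 Eq. (13c) yields 

$$P^{-1}(t) = \lambda(t)\left[P^{-1}(t-1) + x(t-1)x'(t-1)\right]$$

Thus, setting $s(t) := \prod_{i=1}^{t} \lambda(i)$, since $0 < \lambda(t) \le 1$ by Lemma 8.6-1, it follows that 

$$s^{-1}(t)P^{-1}(t) \ge I(t) := P^{-1}(0) + \sum_{i=1}^{t} x(i-1)x'(i-1)$$

Since $I(t)$ is monotonically nondecreasing, 

$$\lim_{t \to \infty} I(t) = \lim_{k \to \infty} I(kN)$$

Let 

$$\Phi(t) := [\, x(t) \ \cdots \ x(t-N+1) \,] \in \mathbb{R}^{n_\theta \times N}$$

Then 

$$I(kN) \ge P^{-1}(0) + x(0)x'(0) + \sum_{i=1}^{k-1} \Phi(iN)\Phi'(iN)$$

Consequently, 

$$\Phi(iN)\Phi'(iN) > 0 \ \text{ for infinitely many } i \ \Longrightarrow \ \lim_{t \to \infty} \lambda_{\min}[I(t)] = \infty \qquad \text{(9.1-30)}$$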



Next we constructively show that, if $\nu(k)$ is chosen so as to fulfill (28), the L.H.S. of (30) holds 
whenever (23) is satisfied for infinitely many $t = kN + 1$. 

Consider the following "partial state" representation of the plant: 

$$A(d)\Delta(d)\xi(t) = \delta u(t)$$
$$y(t) = B(d)\xi(t)$$

Let 

$$Z(t) := \xi^{t}_{t-2n+1} \in \mathbb{R}^{2n}$$

and 

$$\varphi_1(t) := \left[ \left(y^{t}_{t-n+1}\right)' \ \left(\delta u^{t}_{t-n+1}\right)' \right]' \in \mathbb{R}^{2n} \qquad \text{(9.1-31)}$$

Then, $\varphi(t) = \Gamma\varphi_1(t)$ with $\Gamma$ a full row-rank matrix, and $\varphi_1(t) = SZ(t)$ where $S$ is the Sylvester 
resultant matrix [Kai80] associated with the polynomials $A(d)\Delta(d)$ and $B(d)$, viz. the $2n \times 2n$ matrix 
whose first $n$ rows carry successively shifted copies of $b_1 \ \cdots \ b_n$ and whose last $n$ rows carry 
successively shifted copies of $1 \ a_1 \ \cdots \ a_n$, if $A(d)\Delta(d) = 1 + a_1 d + \cdots + a_n d^n$. 
Since $x(t) = \varphi(t)/m(t)$, $\Gamma$ is full row-rank, and by (7a) $S$ is nonsingular, 

$$\sum_{t=iN-4n+2}^{iN} Z(t)Z'(t) > 0$$

implies 

$$\Phi(iN)\Phi'(iN) = \sum_{t=(i-1)N+1}^{iN} x(t)x'(t) > 0$$

since $N \ge 4n - 1$. Rewrite the control law (26) as follows 

$$\Psi'(t)SZ(t) = v(t) + \eta(t)$$

where 

$$\Psi'(t) := [\, s_0(t) \ \cdots \ s_{n-1}(t) \ \ 1 \ \ r_1(t) \ \cdots \ r_{n-1}(t) \,]$$

is the row-vector whose components are the coefficients of $R_t(d)$ and $S_t(d)$. For $t \in ((i-1)N, iN]$, 
$\Psi(t) = \Psi(iN)$. Hence, for $t = iN, \cdots, iN - 2n + 1$, 

$$\Psi'(iN)S\Lambda(t) = \mathbf{v}'(t) + \nu(i)\mathbf{f}'(t)$$

$$\Lambda(t) := [\, Z(t) \ \cdots \ Z(t-2n+1) \,]$$
$$\mathbf{v}'(t) := [\, v(t) \ \cdots \ v(t-2n+1) \,]$$
$$\mathbf{f}'(t) := [\, \delta_{t,\,iN-2n+1} \ \cdots \ \delta_{t-2n+1,\,iN-2n+1} \,] \qquad \text{(9.1-32)}$$

Let $w$ be a vector in $\mathbb{R}^{2n}$ of unit norm. Then 

$$[\Psi'(iN)S\Lambda(t)w]^2 = [\mathbf{v}'(t)w + \nu(i)\mathbf{f}'(t)w]^2$$

Denoting the inner product by $\langle \cdot, \cdot \rangle$, the L.H.S. of the above equation equals 

$$[\langle \Lambda(t)w, S'\Psi(iN) \rangle]^2 \le \|S'\Psi(iN)\|^2 \, \|\Lambda(t)w\|^2$$

where the upperbound follows by Schwarz inequality. Summing over $t = iN - 2n + 1, \cdots, iN$, we 
get 

$$\|S'\Psi(iN)\|^2 \sum_{t=iN-2n+1}^{iN} \|\Lambda(t)w\|^2 \ge \sum_{t=iN-2n+1}^{iN} [\mathbf{v}'(t)w + \nu(i)\mathbf{f}'(t)w]^2 \qquad \text{(9.1-33)}$$

Let 

$$F := [\, \mathbf{f}(iN) \ \cdots \ \mathbf{f}(iN-2n+1) \,]'$$

By (32), $F = [\, e_{2n} \ \cdots \ e_1 \,]$ where $e_i$ denotes the $i$-th vector of the natural basis of $\mathbb{R}^{2n}$. 
Hence, $F = F'$ and $F^2 = I$. Then, the R.H.S. of (33) equals $\|(\nu F + V)w\|^2 = \|(\nu I + FV)w\|^2$ 
where $V := [\, \mathbf{v}(iN) \ \cdots \ \mathbf{v}(iN-2n+1) \,]'$ and $FV = V(k)$ with $V(k)$ as in (27). It follows 
that the choice (28) makes the R.H.S. of (33) positive. On the other hand, by using the symmetry 
of $\Lambda(t)$, 

$$\sum_{t=iN-2n+1}^{iN} \|\Lambda(t)w\|^2 = \sum_{t=iN-2n+1}^{iN} \sum_{j=0}^{2n-1} [w'Z(t-j)]^2 \le 2n \sum_{t=iN-4n+2}^{iN} [w'Z(t)]^2$$




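Since $0 < \|S'\Psi(iN)\|^2 < \infty$, it follows that 

$$\sum_{t=iN-4n+2}^{iN} Z(t)Z'(t) > 0$$

Hence, by (30), $\lim_{t \to \infty} \lambda_{\min}[I(t)] = \infty$. On the other hand, 

$$\lambda_{\min}[I(t)] \le s^{-1}(t)\,\lambda_{\min}[P^{-1}(t)] = \left[ s(t)\,\lambda_{\max}[P(t)] \right]^{-1} \le \frac{n_\theta}{s(t)\,\mathrm{Tr}[P(0)]}$$

Hence, $\lim_{t \to \infty} s(t) = 0$ and, by (17), $\lim_{t \to \infty} \theta(t) = \theta^*$. 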



The next lemma points out that, if the self-excitation signal satisfies (28), after a finite 
time the self-excitation mechanism turns off forever and, henceforth, the estimate 
is secured to be nonpathologic. 

Lemma 9.1-2. For the adaptive SIORHC algorithm applied to the plant (1) the 
following self-excitation stopping time property holds. Let the self-excitation take 
place as in (28). Then, for the adaptive SIORHC algorithm (13), (21)-(25), applied 
to the plant (1), (7), (8), there is a finite integer $T_1$ such that $\Xi(\theta(t)) > \varsigma$, and 
hence $\eta(t) = 0$, for all $t > T_1$. 

Proof (By contradiction): Assume that no $T_1$ exists with the stated property. Thus, there is an 
infinite subsequence $\{t_i\}$ such that $\Xi(\theta(t_i)) < \varsigma$. From Lemma 1 it follows that $\theta(t)$ converges to 
$\theta^*$. Since $\varsigma < \varsigma^*$, there is a finite $T_1$ such that $\Xi(\theta(t)) > \varsigma$ for every $t > T_1$. This contradicts the 
assumption. 

We are now ready to prove global convergence of the adaptive control system. To 
this end, we recall that the adaptive controller generates $\delta u(t)$ as in (26) with 

$$R_t(d) = R_{t-1}(d), \qquad S_t(d) = S_{t-1}(d), \qquad Z_t(d) = Z_{t-1}(d), \qquad t \in (kN+1, (k+1)N]$$

From (13b) it follows that 

$$A_{t-1}(d)\Delta(d)y(t) = B_{t-1}(d)\delta u(t) + e(t) \qquad \text{(9.1-34)}$$

with $e(t) = m(t-1)\varepsilon(t)$. Using (26) and (34), the following closed-loop system is 
obtained ($d$ is omitted): 

$$\begin{bmatrix} A_{t-1}\Delta & -B_{t-1} \\ S_{kN+1} & R_{kN+1} \end{bmatrix} \begin{bmatrix} y(t) \\ \delta u(t) \end{bmatrix} = \begin{bmatrix} e(t) \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ Z_{kN+1} \end{bmatrix} r(t+T) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \eta(t) \qquad \text{(9.1-35)}$$

where $t \in (kN, (k+1)N]$. By (14) the coefficients of the polynomials $A_t\Delta$ and $B_t$ 
are bounded and the same is true for $R_t$, $S_t$ and $Z_t$. Further, from (16), $(A_{t-1} - A_{kN+1}) \to 0$ 
and $(B_{t-1} - B_{kN+1}) \to 0$, $t \in (kN, (k+1)N]$, as $t \to \infty$. Hence, as 
$t \to \infty$, the $d$-characteristic polynomial of the system (35) 

$$\chi_t(d) := A_{t-1}\Delta R_{kN+1} + B_{t-1}S_{kN+1}$$
$$\phantom{\chi_t(d) :}= A_{kN+1}\Delta R_{kN+1} + B_{kN+1}S_{kN+1} + (A_{t-1} - A_{kN+1})\Delta R_{kN+1} + (B_{t-1} - B_{kN+1})S_{kN+1}$$

and 

$$\chi_k(d) := A_{kN+1}\Delta R_{kN+1} + B_{kN+1}S_{kN+1}$$

as $k \to \infty$, behave in the same way. Now, by virtue of Lemma 2, the latter is strictly 
Hurwitz for all $k$ such that $kN + 1 > T_1$, $T_1$ being a finite integer. Consequently, 
there exists a finite time such that for all subsequent times the $d$-characteristic 
polynomial $\chi_t(d)$ of the slowly time-varying system (35) is strictly Hurwitz. Then, 
it follows from Theorem A-9 that (35) is exponentially stable. Consequently, since by 
Lemma 2 $\eta(t) = 0$ for all $t > T_1$, and assuming that $|r(t)| \le M_r < \infty$, the 
linear boundedness condition (Cf. (8.3-10)) 

$$\|\varphi_1(t-1)\| \le c_1 + c_2 \max_{i \in [1,t)} |e(i)|$$

holds for bounded nonnegative reals $c_1$ and $c_2$. Here $\varphi_1(t)$ denotes the vector in 
(31). 

By (18) we have also 

$$0 = \lim_{t \to \infty} \varepsilon^2(t) = \lim_{t \to \infty} \frac{e^2(t)}{\left[\max\{m, \|\Gamma\varphi_1(t-1)\|\}\right]^2} \ge \lim_{t \to \infty} \frac{e^2(t)}{m^2 + c_3\|\varphi_1(t-1)\|^2}$$

where $\Gamma$, as noted after (31), is such that $\varphi(t) = \Gamma\varphi_1(t)$. Hence, the last limit 
is zero. We can then apply the Key Technical Lemma (8.3-1) to conclude that 
$\{\|\varphi_1(t)\|\}$ is a bounded sequence and $\lim_{t \to \infty} e(t) = 0$. In particular, boundedness 
of $\{\|\varphi_1(t)\|\}$ is equivalent to boundedness of $\{y(t)\}$ and $\{\delta u(t)\}$. To show that 
$\{u(t)\}$ is also bounded we use the following argument. By (7a) and (B-10) there 
exist polynomials $X(d)$ and $Y(d)$ satisfying the Bezout identity 

$$A(d)\Delta(d)X(d) + B(d)Y(d) = 1$$

Therefore 

$$u(t) = A(d)X(d)\Delta(d)u(t) + B(d)Y(d)u(t)$$
$$\phantom{u(t)} = A(d)X(d)\delta u(t) + Y(d)A(d)[y(t) - c] \qquad [(1)]$$

Hence $u(t)$ is expressed in terms of a linear combination of a finite number of terms 
from $\delta u^t$ and $y^t$, plus the constant term $Y(d)A(d)c$. Boundedness of $\{u(t)\}$ thus 
follows from that of $\{\delta u(t)\}$, $\{y(t)\}$ and $c$. 
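
The Bezout argument can be checked numerically on a toy plant. The sketch below is only an illustration under stated assumptions: the plant $A(d) = 1 - 0.5d$, $B(d) = d$, the offset $c = 2$, the solution pair $X(d) = 1$, $Y(d) = 1.5 - 0.5d$ and all function names are made up for the example, and the plant is taken in the form $A(d)[y(t) - c] = B(d)u(t)$ suggested by the identity above, with zero initial conditions.

```python
def poly_mul(p, q):
    """Product of two polynomials in d (coefficients in ascending powers)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Sum of two polynomials in d, padding the shorter one with zeros."""
    m = max(len(p), len(q))
    p = list(p) + [0.0] * (m - len(p))
    q = list(q) + [0.0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]


def shift_apply(poly, seq, t):
    """[poly(d) seq](t), d being the unit backward shift; seq is zero before t = 0."""
    return sum(c * seq[t - i] for i, c in enumerate(poly) if t - i >= 0)


# Hypothetical example data (not taken from the text).
A, DELTA, B = [1.0, -0.5], [1.0, -1.0], [0.0, 1.0]
X, Y = [1.0], [1.5, -0.5]
C_OFFSET = 2.0


def simulate_output(u):
    """Outputs of A(d)[y(t) - c] = B(d)u(t) with zero initial conditions."""
    z = []
    for t in range(len(u)):
        past = sum(-a * z[t - i] for i, a in enumerate(A) if 1 <= i <= t)
        z.append(past + shift_apply(B, u, t))
    return [v + C_OFFSET for v in z]


def reconstruct_input(du, y):
    """u(t) = A X du(t) + Y A [y(t) - c], the identity used in the text."""
    ax, ya = poly_mul(A, X), poly_mul(Y, A)
    z = [v - C_OFFSET for v in y]
    return [shift_apply(ax, du, t) + shift_apply(ya, z, t) for t in range(len(du))]


u = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
du = [u[0]] + [u[t] - u[t - 1] for t in range(1, len(u))]  # du = Delta(d) u
u_rec = reconstruct_input(du, simulate_output(u))
```

Here `u_rec` reproduces `u` sample by sample, which is the finite linear combination of past increments and outputs that the boundedness argument relies on.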

The above results are summed up in the next theorem in which additional 
convergence properties of the adaptive system are also stated. 

Theorem 9.1-1. (Global convergence of adaptive SIORHC) Consider the 
adaptive SIORHC algorithm (13), (21)-(25), applied to the plant (1), (7), (8). Let 
the output reference sequence $\{r(t)\}$ be bounded and the self-excitation signal be 
chosen so as to fulfill (28). Then, the resulting adaptive system is globally convergent. 
Specifically: 

i. $u(t)$ and $y(t)$ are uniformly bounded; 

ii. The controller parameters self-tune to a stabilizing control law in such a way 
that after a finite number of steps $\Xi(\theta(t)) > \varsigma$ and henceforth self-excitation 
turns off; 

iii. The multistep-ahead output prediction errors asymptotically vanish, viz. 

$$\lim_{t \to \infty} \left[ y^{t+T+n-1}_{t+1} - \hat{y}^{t+T+n-1}_{t+1} \right] = O \qquad \text{(9.1-36a)}$$

where 

$$\hat{y}^{t+T+n-1}_{t+1} := W(t)\,\delta u^{t+T+n-2}_{t} + \Gamma(t)\,s(t) \qquad \text{(9.1-36b)}$$

iv. The adaptive system is asymptotically offset-free, i.e. 

$$r(t) \equiv r \ \Longrightarrow \ \lim_{t \to \infty} y(t) = r \ \text{ and } \ \lim_{t \to \infty} \delta u(t) = 0, \qquad \text{(9.1-37)}$$

and yields asymptotic rejection of constant disturbances. 

Proof It remains to prove iii. and iv. As shown in [MZ91], (36) follows from the fact that 
$\lim_{t \to \infty} e(t) = 0$ and $\lim_{t \to \infty} \|\theta(t) - \theta(t-1)\| = 0$. As for (37), if $r(t) \equiv r$, from (35), (15) and 
taking into account that $e(t) \to 0$, $\eta(t) \to 0$, $\chi_t(d) \to \chi_\infty(d) = A_\infty(d)\Delta(d)R_\infty(d) + B_\infty(d)S_\infty(d)$ 
with $\chi_\infty(d)$ strictly Hurwitz, we conclude that $\delta u(t) \to 0$ and 

$$\lim_{t \to \infty} y(t) = \frac{B_\infty(1)Z_\infty(1)}{A_\infty(1)\Delta(1)R_\infty(1) + B_\infty(1)S_\infty(1)}\, r = \frac{Z_\infty(1)}{S_\infty(1)}\, r = r$$

where the last equality follows because $Z_t(1) = S_t(1)$ by the same arguments preceding Theorem 
5.8-2. 

Remark 9.1-3 The self-excitation condition (28) can be easily fulfilled whenever 
the matrix $V(k)$ is known at time $kN - 2n + 1$. In such a case, in fact, one has to 
check that $\nu(k)$ is not one of the $2n$ roots of the characteristic polynomial of $V(k)$. 
Consequently, at most $2n + 1$ attempts suffice to satisfy (28) with an arbitrarily 
small $|\nu(k)|$. In particular, $\nu(k)$ can be chosen to be zero whenever $\det V(k) \ne 0$. In 
fact, $\det V(k) \ne 0$ can be interpreted as a condition of excitation over the interval 
$[kN - 4n + 2, kN]$ caused by the command input samples $v(kN-4n+2), \cdots, v(kN)$. However, knowledge 
of $V(k)$ at time $kN - 2n + 1$ implies knowledge of the reference up to time $kN + T$, 
viz. $T + 2n$ steps in advance. This is, in fact, the case in some applications where 
the desired future output profile is known a few steps in advance. 

If $V(k)$ is unknown at time $kN - 2n + 1$ and $\{r(t)\}$ is upperbounded by $M_r$ 

$$|r(t)| \le M_r$$

(28) can be guaranteed (Cf. Problem 1) by taking 

$$|\nu(k)| > 2n\,T^{1/2} M_r \|Z_{kN}(d)\| \ge \bar{\sigma}(V(k)) \qquad \text{(9.1-38)}$$

with $\|Z_{kN}(d)\|^2 = \sum_{i=0}^{T-1} (z_{kN,i})^2$ if $Z_{kN}(d) = \sum_{i=0}^{T-1} z_{kN,i}\,d^i$, and $\bar{\sigma}(V(k))$ denotes 
the maximum singular value of $V(k)$. Note that (38) is quite conservative w.r.t. 
(28). □ 
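
Both situations of Remark 3 can be sketched in a few lines. The code is only an illustration under stated assumptions: all function names are hypothetical, and when $V(k)$ is known the sketch does not test candidate values against the eigenvalues but uses a cruder sufficient rule, namely taking $\nu(k)$ above the Frobenius norm of $V(k)$, which upperbounds the maximum singular value and hence the modulus of every eigenvalue.

```python
import math


def toeplitz_V(v, k, N, n):
    """V(k) of (27): entry (i, j) equals v(kN - 2n + 1 + i - j), i, j = 0..2n-1.

    v -- mapping from the time index t to the feedforward sample v(t)
    """
    base = k * N - 2 * n + 1
    return [[v[base + i - j] for j in range(2 * n)] for i in range(2 * n)]


def frobenius_norm(M):
    """Frobenius norm; never smaller than the maximum singular value."""
    return math.sqrt(sum(x * x for row in M for x in row))


def dither_from_known_V(V, margin=1e-3):
    """A value strictly above ||V||_F, hence outside the spectrum of V and of -V."""
    return frobenius_norm(V) + margin


def reference_bound(n, T, M_r, z_coeffs):
    """Middle term of (38): 2n sqrt(T) M_r ||Z_kN(d)||; |nu(k)| must exceed it."""
    return 2 * n * math.sqrt(T) * M_r * math.sqrt(sum(z * z for z in z_coeffs))
```

For instance, with $n = 1$, $N = 3$, $k = 2$ and samples $v(4) = 3$, $v(5) = 0$, $v(6) = 4$, the matrix is $[[0, 3], [4, 0]]$, its eigenvalues are $\pm\sqrt{12}$, and any $\nu(k)$ above the Frobenius norm 5 satisfies (28). The Frobenius rule is more conservative than (28) itself, in the same spirit as (38).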

Problem 9.1-1 Prove that (28) is guaranteed if the self-excitation signal $\nu(k)$ satisfies (38). 
[Hint: Show first that $|\nu(k)| > \bar{\sigma}(V(k))$ suffices. Next prove that $\bar{\sigma}(V(k))$ is upperbounded as in 
(38). ] 

Problem 9.1-2 Modify the adaptive SIORHC algorithm so as to construct for the plant (1) 
a globally convergent adaptive pole-positioning regulator with self-excitation, whose underlying 
control problem consists of selecting the regulation law $R(d)\delta u(t) = -S(d)y(t)$, the polynomials 
$R(d)$ and $S(d)$ solving the Diophantine equation 

$$A(d)\Delta(d)R(d) + B(d)S(d) = Q(d)$$

with $Q(d)$ strictly Hurwitz and such that $\partial Q(d) = 2n - 1$. 

Figure 9.1-2: Plant with input and output bounded disturbances. 



9.1.2 The Bounded Disturbance Case 

Here, unlike (1), the plant is given by 

$$A(d)y(t) = B(d)\left[u(t) + \omega_u(t)\right] + A(d)\left[\omega_y(t) + c\right] \qquad \text{(9.1-39a)}$$

where the polynomials $A(d)$ and $B(d)$ as well as $y(t)$, $u(t)$ and $c$ are as in (1). Further, 
$\omega_u(t)$ and $\omega_y(t)$ denote respectively input and output bounded disturbances 
such that 

$$|\omega_u(t)| \le \Omega_u < \infty \qquad |\omega_y(t)| \le \Omega_y < \infty \qquad \text{(9.1-39b)}$$

with $\Omega_u$ and $\Omega_y$ two nonnegative real numbers. Fig. 2 depicts the situation. 
Similarly to (1c), here we find 

$$A(d)\Delta(d)y(t) = B(d)\left[\delta u(t) + \delta\omega_u(t)\right] + A(d)\Delta(d)\omega_y(t) \qquad \text{(9.1-39c)}$$

We adopt again Assumption 1 so as to guarantee that the SIORHC law (6), constructed 
from (39c) with $\delta\omega_u(t) \equiv \omega_y(t) \equiv 0$, stabilizes the plant. 

We now go into the details of constructing and analysing an adaptive predictive 
controller for the plant (39). The controller is obtained by combining a CT-NRLS 
with dead-zone and the SIORHC law in such a way as to make the resulting adaptive 
control system globally convergent. 

I. Identification algorithm 

The identification algorithm is aimed at identifying the polynomials $A(d)$ and $B(d)$ 
in the plant incremental model 

$$A(d)\delta y(t) = B(d)\delta u(t) + \delta\omega(t) \qquad \text{(9.1-40a)}$$
$$\delta\omega(t) := B(d)\delta\omega_u(t) + A(d)\delta\omega_y(t) \qquad \text{(9.1-40b)}$$

Note that for a nonnegative real number $\Omega$ we have 

$$|\delta\omega(t)| \le \Omega < \infty \qquad \text{(9.1-41a)}$$

In fact 

$$|\delta\omega(t)| \le \Omega := 2n \left[ \Omega_u \max_i |b_i| + \Omega_y \max_i |a_i| \right] \qquad \text{(9.1-41b)}$$

the maxima being taken over the coefficients of $B(d)$ and $A(d)$, respectively. 
Using the same notations as in (10) we have 

$$\delta y(t) = \varphi'(t-1)\theta^* + \delta\omega(t) \qquad \text{(9.1-42)}$$

or, in terms of normalized data, 

$$\delta\gamma(t) = x'(t-1)\theta^* + \delta\bar{\omega}(t) \qquad \text{(9.1-43a)}$$

$$\delta\bar{\omega}(t) := \frac{\delta\omega(t)}{m(t-1)} \qquad \text{(9.1-43b)}$$
We next consider the following identification algorithm. 

CT-NRLS with dead zone (DZ-CT-NRLS) 

This is the same as the RDZ-CT-NRLS algorithm (8.9-13) with one change. It 
consists of replacing the relative dead-zone (8.9-13f) with an "absolute" dead-zone. 
Specifically, the DZ-CT-NRLS algorithm is defined by (8.9-13a)-(8.9-13d) with 

$$\kappa(t) = \begin{cases} \kappa & \text{if } |e(t)| \ge (1+\epsilon)^{1/2}\,\Omega \\ 0 & \text{otherwise} \end{cases} \qquad \text{(9.1-44)}$$

The algorithm is initialized for any $P(0) = P'(0) > 0$ and any $\theta(0) \in \mathbb{R}^{n_\theta}$ such 
that $A_0(d)\Delta(d)$ and $B_0(d)$ are coprime. 

The rationale for the dead-zone facility (44) is suggested by an equation similar 
to (8.9-17) which can be obtained by adapting the solution of Problem 8.9-2 to 
the present context. The dead-zone mechanism (44) freezes the estimate whenever 
the absolute value of the prediction error becomes smaller than an upperbound for 
the disturbance $\delta\omega(t)$ in (42). The properties of the above algorithm which will be 
used in the sequel are listed hereafter. 
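
The switching logic of an absolute dead-zone can be sketched as follows. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, the test is assumed to act on the unnormalized prediction error with threshold $(1+\epsilon)^{1/2}\Omega$ (the level quoted in (64)), and the estimator recursions (8.9-13a)-(8.9-13d) into which the gain enters are not reproduced.

```python
import math


def dead_zone_gain(e: float, kappa: float, eps: float, omega: float) -> float:
    """Return kappa when the prediction error leaves the dead zone, 0 otherwise.

    e     -- prediction error (assumed unnormalized)
    kappa -- positive gain used outside the dead zone (assumed design constant)
    eps   -- positive design constant widening the zone beyond the bound omega
    omega -- upperbound on the disturbance magnitude, as in (41a)
    """
    threshold = math.sqrt(1.0 + eps) * omega
    # A zero gain freezes the estimate: errors this small could be caused
    # by the disturbance alone and carry no reliable parameter information.
    return kappa if abs(e) >= threshold else 0.0
```

With `eps = 0.21` and `omega = 1.0` the threshold is 1.1, so an error of 0.5 leaves the estimate frozen while an error of 2.0 triggers an update.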

Result 9.1-1. (DZ-CT-NRLS) Consider the DZ-CT-NRLS algorithm (8.9-13a)-(8.9-13e) 
and (44) along with the data generating mechanism (40) and (41). 
Then, the following properties hold: 

i. Uniform boundedness of the estimates 

$$\|\theta(t)\| < M_\theta < \infty \qquad \forall t \in \mathbb{Z}_1 \qquad \text{(9.1-45)}$$

ii. Vanishing normalized prediction error 

$$\lim_{t \to \infty} \kappa(t)\varepsilon^2(t) = 0 \qquad \text{(9.1-46)}$$

iii. Slow asymptotic variations 

$$\lim_{t \to \infty} \|\theta(t) - \theta(t-k)\| = 0 \qquad \forall k \in \mathbb{Z}_1 \qquad \text{(9.1-47)}$$

Problem 9.1-3 Prove Result 1. [Hint: Adapt Problem 8.9-2 to the present case. ] 

For the same reasons discussed before (20a), we next proceed to adopt a self-excitation 
mechanism. To this end we define a syndrome according to which the 
controller detects its pathological condition as in (20) and (22). We now construct 
the remaining part of the adaptive controller. 

II. Controller with Self-Excitation 

The tuning mechanism of the controller parameters is the same as in the ideal case: 
viz. (21)-(26). Cf. also Fig. 1. Remark 1 applies to the present context as well. The 
algorithm (8.9-13a)-(8.9-13d), (44), (21)-(26) will be referred to by the acronym 
DZ-CT-NRLS + SIORHC. 



III. Convergence Analysis 

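We analyse the DZ-CT-NRLS + SIORHC algorithm applied to the plant (39). 
The following lemma replaces Lemma 1 in the present case. 

Lemma 9.1-3. For the DZ-CT-NRLS + SIORHC algorithm applied to the plant 
(39) the following property holds. Given any nonnegative $\beta$, there is a nonnegative 
bounded real number $\underline{\nu}(k, \beta)$ such that 

$$|\nu(k)| \ge \underline{\nu}(k, \beta) \ \Longrightarrow \ \sum_{t=kN-4n+2}^{kN} \varphi(t)\varphi'(t) \ge \beta^2 I_{n_\theta} \qquad \text{(9.1-48)}$$

where $|\nu(k)|$ denotes the intensity of the dither injected into the plant according to 
the self-excitation mechanism (21)-(26). Further, the implication (48) is fulfilled 
if $\underline{\nu}(k, \beta)$ is chosen as follows 

$$\underline{\nu}(k, \beta) := \frac{\bar{\sigma}(\Gamma')\,\bar{\sigma}(S)}{\underline{\sigma}(\Gamma')\,\underline{\sigma}(S)} \left[ \frac{(2n)^{1/2}\,\|S'\Psi(kN)\|}{\bar{\sigma}(\Gamma')\,\bar{\sigma}(S)}\,\beta_1 + \bar{\sigma}(V(k)) + \bar{\sigma}(V_\omega(k)) \right] \qquad \text{(9.1-49a)}$$

where: 

$$\beta_1 := \beta + \bar{\sigma}(\Gamma') \left[ n(4n-1)\left(\Omega_y^2 + 4\Omega_u^2\right) \right]^{1/2} \qquad \text{(9.1-49b)}$$

$S$ denotes the Sylvester resultant matrix defined after (31); $\Psi(kN)$ is the vector 

$$\Psi(kN) := [\, s_0(kN) \ \cdots \ s_{n-1}(kN) \ \ 1 \ \ r_1(kN) \ \cdots \ r_{n-1}(kN) \,]' \qquad \text{(9.1-49c)}$$

whose components are the coefficients of $R_{kN}(d)$ and $S_{kN}(d)$ of the SIORHC law 
at the time $kN$; $\Gamma$ is the matrix defined after (31); $V(k)$ is as in (27); 

$$V_\omega(k) := \begin{bmatrix} \zeta(kN-2n+1) & \zeta(kN-2n) & \cdots & \zeta(kN-4n+2) \\ \vdots & \vdots & & \vdots \\ \zeta(kN) & \zeta(kN-1) & \cdots & \zeta(kN-2n+1) \end{bmatrix} \qquad \text{(9.1-49d)}$$

$$\zeta(t) := \Psi'(kN)\varphi_\omega(t)$$

with $\varphi_\omega(t)$ as in (53b)-(53d); and $\bar{\sigma}(M)$ and $\underline{\sigma}(M)$ denote respectively the maximum 
and the minimum singular value of the matrix $M$. 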

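Proof Consider any unit norm vector $w \in \mathbb{R}^{2n-1}$. Then, the R.H.S. of (48) is equivalent to the 
inequality 

$$\beta^2 \le w' \left[ \sum_{t=kN-4n+2}^{kN} \varphi(t)\varphi'(t) \right] w = w_1' \left[ \sum_{t=kN-4n+2}^{kN} \varphi_1(t)\varphi_1'(t) \right] w_1 \qquad \text{(9.1-50)}$$

$$w_1 := \Gamma' w \in \mathbb{R}^{2n}$$

In fact, as discussed in the proof of Lemma 1, 

$$\varphi(t) = \Gamma\varphi_1(t) \qquad \text{(9.1-51)}$$

with 

$$\varphi_1(t) := \left[ \left(y^{t}_{t-n+1}\right)' \ \left(\delta u^{t}_{t-n+1}\right)' \right]'$$

and $\Gamma$ a full row-rank matrix. 

Consider the following "partial state" representation of the plant (39c) 

$$A(d)\Delta(d)\xi(t) = \delta u(t) + \delta\omega_u(t)$$
$$y(t) = B(d)\xi(t) + \omega_y(t)$$

Define next the partial-state vector 

$$Z(t) := \xi^{t}_{t-2n+1} \in \mathbb{R}^{2n} \qquad \text{(9.1-52)}$$

Then, we find the following relationship between $\varphi_1(t)$ and $Z(t)$ 

$$\varphi_1(t) = SZ(t) + \varphi_\omega(t) \qquad \text{(9.1-53a)}$$

where 

$$\varphi_\omega(t) := \varphi_y(t) - \varphi_u(t) \qquad \text{(9.1-53b)}$$

and 

$$\varphi_y(t) := [\, \omega_y(t) \ \cdots \ \omega_y(t-n+1) \ \ 0 \ \cdots \ 0 \,]' \qquad \text{(9.1-53c)}$$
$$\varphi_u(t) := [\, 0 \ \cdots \ 0 \ \ \delta\omega_u(t) \ \cdots \ \delta\omega_u(t-n+1) \,]' \qquad \text{(9.1-53d)}$$

and, as in the proof of Lemma 1, $S$ denotes the Sylvester resultant matrix associated with the 
polynomials $A(d)\Delta(d)$ and $B(d)$. Substituting (53a) into (50) we get 

$$\sum_{t=kN-4n+2}^{kN} \left[ w_1'SZ(t) + w_1'\varphi_\omega(t) \right]^2 \ge \beta^2 \qquad \text{(9.1-54)}$$

Now 

$$\sum_t \left[ w_1'SZ(t) + w_1'\varphi_\omega(t) \right]^2 \ge \left[ \left( \sum_t [w_1'SZ(t)]^2 \right)^{1/2} - \left( \sum_t [w_1'\varphi_\omega(t)]^2 \right)^{1/2} \right]^2$$

all sums being over $t = kN-4n+2, \cdots, kN$. Further 

$$[w_1'\varphi_\omega(t)]^2 \le \|w_1\|^2 \|\varphi_\omega(t)\|^2 = \|\Gamma'w\|^2 \|\varphi_\omega(t)\|^2 \le \bar{\sigma}^2(\Gamma')\, n\, \Omega_1^2$$

where 

$$\Omega_1^2 := \Omega_y^2 + 4\Omega_u^2 \qquad \text{(9.1-55)}$$

and $\bar{\sigma}(\Gamma')$ denotes the maximum singular value of $\Gamma'$. We then conclude that (54), and hence 
(50), is satisfied provided that 

$$\sum_{t=kN-4n+2}^{kN} [w_2'Z(t)]^2 \ge \beta_1^2 \qquad \text{(9.1-56a)}$$
$$\beta_1 := \beta + [n(4n-1)]^{1/2}\,\bar{\sigma}(\Gamma')\,\Omega_1 \qquad \text{(9.1-56b)}$$
$$w_2 := S'w_1 = S'\Gamma'w \qquad \text{(9.1-56c)}$$

Rewrite the control law (26) as follows 

$$v(t) + \eta(t) = \Psi'(t)\varphi_1(t) = \Psi'(t)\left[SZ(t) + \varphi_\omega(t)\right] \qquad [(53a)]$$

where 

$$\Psi'(t) := [\, s_0(t) \ \cdots \ s_{n-1}(t) \ \ 1 \ \ r_1(t) \ \cdots \ r_{n-1}(t) \,]$$

is the row-vector whose components are the coefficients of $R_t(d)$ and $S_t(d)$. Then, 

$$\Psi'(t)SZ(t) = v(t) - \Psi'(t)\varphi_\omega(t) + \eta(t)$$

Defining 

$$\mathbf{v}'(t) := [\, v(t) \ \cdots \ v(t-2n+1) \,]$$
$$\mathbf{z}'(t) := \Psi'(kN)\,[\, \varphi_\omega(t) \ \cdots \ \varphi_\omega(t-2n+1) \,]$$
$$\tilde{\mathbf{v}}'(t) := \mathbf{v}'(t) - \mathbf{z}'(t)$$

and proceeding exactly as in the proof of Lemma 1 after (32), similarly to (33) we find 

$$\|S'\Psi(kN)\|^2 \sum_{t=kN-2n+1}^{kN} \|\Lambda(t)w_2\|^2 \ge \sum_{t=kN-2n+1}^{kN} \left[ \tilde{\mathbf{v}}'(t)w_2 + \nu(k)\mathbf{f}'(t)w_2 \right]^2 = \left\| \left[ \nu(k)I + \tilde{V}(k) \right] w_2 \right\|^2 \qquad \text{(9.1-57a)}$$

where 

$$\tilde{V}(k) := V(k) - V_\omega(k) \qquad \text{(9.1-57b)}$$

with $V(k)$ as in (27) and $V_\omega(k)$ as in (49d). Further, as after (33) 

$$\sum_{t=kN-2n+1}^{kN} \|\Lambda(t)w_2\|^2 \le 2n \sum_{t=kN-4n+2}^{kN} [w_2'Z(t)]^2$$

Combining this inequality with (57), we get 

$$w_2' \left[ \sum_{t=kN-4n+2}^{kN} Z(t)Z'(t) \right] w_2 \ge \frac{\left\| \left[ \nu(k)I + \tilde{V}(k) \right] w_2 \right\|^2}{2n\,\|S'\Psi(kN)\|^2}$$

Therefore, comparing the latter inequality with (56a), we see that (56a), and hence (50), is satisfied 
provided that 

$$\left\| \left[ \nu(k)I + \tilde{V}(k) \right] w_2 \right\|^2 \ge 2n\,\|S'\Psi(kN)\|^2 \beta_1^2$$

Now, provided that all the differences in the next inequalities are nonnegative, we have 

$$\|\nu(k)w_2 + \tilde{V}(k)w_2\| \ge |\nu(k)|\,\|w_2\| - \bar{\sigma}(\tilde{V}(k))\,\|w_2\|$$
$$\ge |\nu(k)|\,\underline{\sigma}(S)\,\underline{\sigma}(\Gamma') - \bar{\sigma}(\tilde{V}(k))\,\bar{\sigma}(S)\,\bar{\sigma}(\Gamma')$$
$$\ge |\nu(k)|\,\underline{\sigma}(S)\,\underline{\sigma}(\Gamma') - \left[ \bar{\sigma}(V(k)) + \bar{\sigma}(V_\omega(k)) \right] \bar{\sigma}(S)\,\bar{\sigma}(\Gamma')$$

Then, (50) is satisfied if 

$$|\nu(k)| \ge \frac{(2n)^{1/2}\,\|S'\Psi(kN)\|\,\beta_1 + \bar{\sigma}(S)\,\bar{\sigma}(\Gamma') \left[ \bar{\sigma}(V(k)) + \bar{\sigma}(V_\omega(k)) \right]}{\underline{\sigma}(S)\,\underline{\sigma}(\Gamma')} \qquad \text{(9.1-58)}$$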

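Problem 9.1-4 Consider (58). Show that 

$$\bar{\sigma}(V_\omega(k)) \le 2n^{3/2}\,\|\Psi(kN)\|\,\Omega_1$$

with $\Omega_1$ as in (55). Recalling (38), check that the R.H.S. of (58) can be upperbounded by 

$$\frac{\bar{\sigma}(\Gamma')\,\bar{\sigma}(S)}{\underline{\sigma}(\Gamma')\,\underline{\sigma}(S)}\,(2n)^{1/2}\,\|\Psi(kN)\| \left[ \frac{\beta}{\bar{\sigma}(\Gamma')} + (2nT)^{1/2}\,\frac{\|Z_{kN}(d)\|\,M_r}{\|\Psi(kN)\|} + n^{1/2} \left( \sqrt{4n-1} + \sqrt{2n} \right) \Omega_1 \right] \qquad \text{(9.1-60)}$$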



The next lemma points out that if the intensity of the self-excitation dither is high 
enough, after a finite time the self-excitation mechanism turns off forever and, 
henceforth, the estimate is secured to be nonpathologic. 

Lemma 9.1-4. For the DZ-CT-NRLS + SIORHC algorithm applied to the plant 
(39) the following self-excitation stopping time property holds. For large enough $\beta$ 
and $\underline{\nu}(k, \beta)$ as in Lemma 3, there exists a finite integer $T_1$ such that 

$$|\nu(k)| \ge \underline{\nu}(k, \beta) \ \Longrightarrow \ \Xi(\theta(t)) > \varsigma, \quad \forall t > T_1 \qquad \text{(9.1-59)}$$

and hence $\eta(t) = 0$, for every $t > T_1$. 

Proof Suppose by contradiction that $\Xi(\theta(t)) < \varsigma$ infinitely often irrespective of how large $\beta$ is 
chosen. Let $\tilde{\theta}(t)$ denote the parameter estimation error. Assume first that $\kappa(t) = \kappa > 0$ infinitely often 
and $\{m(t)\}$ is unbounded. Then, by (46) and (41a) there is a subsequence $\{t_i\}_{i=1}^{\infty}$ along which 
$\lim_{i \to \infty} \|x(t_i)\| = 1$ and $\lim_{i \to \infty} \tilde{\theta}(t_i) = O_{n_\theta}$. 
The latter, together with (47), contradicts that $\Xi(\theta(t)) < \varsigma$ infinitely often since $\Xi(\theta^*) = \varsigma^* > \varsigma$ 
by (20). Therefore, if $\{m(t)\}$ is unbounded, a finite self-excitation stopping time must exist. To 
exhaust the possible alternatives, assume that infinitely often either $\kappa(t) = 0$ or $\kappa(t) = \kappa > 0$ with 
$\{m(t)\}$ bounded. In both cases, by (46), we find that after a finite time 

$$\Omega_2 := \left[ (1+\epsilon)^{1/2} + 1 \right] \Omega \ge \left| \varphi'(t)\tilde{\theta}(kN) + \varphi'(t)\left[ \tilde{\theta}(t) - \tilde{\theta}(kN) \right] \right|$$

By (47) this implies that 

$$|\varphi'(t)\tilde{\theta}(kN)| \le \Omega_3(t)$$

with 

$$\lim_{t \to \infty} \Omega_3(t) = \Omega_2$$

Squaring and summing both sides of the last inequality for $t = kN - 4n + 2, \cdots, kN$, $(k-1)N + 1$ 
being a time at which the syndrome turns on, we find 

$$\Omega_4^2(k) := \sum_{t=kN-4n+2}^{kN} \Omega_3^2(t) \ge \tilde{\theta}'(kN) \left[ \sum_{t=kN-4n+2}^{kN} \varphi(t)\varphi'(t) \right] \tilde{\theta}(kN) \ge \beta^2 \|\tilde{\theta}(kN)\|^2 \qquad [(48)]$$

Hence 

$$\|\tilde{\theta}(kN)\|^2 \le \frac{\Omega_4^2(k)}{\beta^2}$$

where for $k$ large enough $\Omega_4^2(k) < (4n-1)\Omega_2^2 + \delta^2$ with $\delta^2 > 0$. Thus, for $k$ large enough, $\tilde{\theta}(kN)$ 
can be made as small as we wish by choosing $\beta$ sufficiently large. As noted above, this contradicts 
the assumption that $\Xi(\theta(t)) < \varsigma$ infinitely often. 

Remark 9.1-4 Eq. (49) is an interesting expression in that it unveils how the 
different factors affect a lower bound for the required dither intensity $\underline{\nu}(k, \beta)$. First, 
the bound depends on the plant via the ratio $\bar{\sigma}(S)/\underline{\sigma}(S)$, which can be regarded as 
a quantitative measure of the reachability of the plant state-space representation 
for the state $\varphi_1(t)$ in (31), and via the disturbance bounds $\Omega_1$ and $\bar{\sigma}(V_\omega(k))$. Second, 
the dependence on the controller action is explicit in $\Psi(kN)$ and implicit in $V_\omega(k)$ 
and $V(k)$, the latter accounting for the feedforward action. Note that if $\beta = \Omega_u = 
\Omega_y = 0$, (49) reduces to 

$$\underline{\nu}(k, \beta) = \frac{\bar{\sigma}(\Gamma')\,\bar{\sigma}(S)}{\underline{\sigma}(\Gamma')\,\underline{\sigma}(S)}\,\bar{\sigma}(V(k))$$

which is a conservative version of the condition (28) valid for the ideal case. □ 
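
To see how the individual factors enter the dither bound, the R.H.S. of (58) together with (55) and (56b) can be evaluated numerically. The sketch below is only an illustration: the function and argument names are hypothetical, and the singular values and norms are supplied by the caller, since in practice they depend on the unknown true plant.

```python
import math


def beta_1(beta, n, smax_gamma, omega_u, omega_y):
    """beta_1 of (56b), with Omega_1 as in (55)."""
    omega_1 = math.sqrt(omega_y ** 2 + 4.0 * omega_u ** 2)
    return beta + math.sqrt(n * (4 * n - 1)) * smax_gamma * omega_1


def dither_lower_bound(n, norm_S_psi, b1, smax_S, smin_S,
                       smax_gamma, smin_gamma, smax_V, smax_V_omega):
    """R.H.S. of (58).

    norm_S_psi      -- ||S' Psi(kN)||
    b1              -- beta_1, e.g. from beta_1() above
    smax_*, smin_*  -- maximum / minimum singular values of S and Gamma'
    smax_V          -- maximum singular value of V(k), feedforward term
    smax_V_omega    -- maximum singular value of V_omega(k), disturbance term
    """
    if smin_S <= 0.0 or smin_gamma <= 0.0:
        raise ValueError("S and Gamma' must have positive minimum singular values")
    numerator = (math.sqrt(2 * n) * norm_S_psi * b1
                 + smax_S * smax_gamma * (smax_V + smax_V_omega))
    return numerator / (smin_S * smin_gamma)
```

Setting `beta`, `omega_u` and `omega_y` to zero makes `beta_1` vanish, and with `smax_V_omega = 0` the bound collapses to the conditioning ratio times the maximum singular value of $V(k)$, as noted at the end of the remark.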

We are now ready to prove the main result for the adaptive system in the presence 
of bounded disturbances. 

Theorem 9.1-2. (Global Convergence of DZ-CT-NRLS + SIORHC) 
Consider the DZ-CT-NRLS + SIORHC applied to the plant (39). Let the output 
reference $\{r(t)\}$ be bounded and the self-excitation intensity be chosen, according to 
Lemma 4, large enough to guarantee a finite self-excitation stopping time. Then, 
the resulting adaptive system is globally convergent. Specifically: 

i. $u(t)$ and $y(t)$ are uniformly bounded; 

ii. After a finite time $T_2$ the parameter estimate equals 

$$\theta(\infty) \sim (A_\infty(d), B_\infty(d)) \qquad \text{(9.1-61)}$$

and the controller self-tunes on a control law 

$$R_\infty(d)\delta u(t) = -S_\infty(d)y(t) + Z_\infty(d)r(t+T), \quad \forall t > T_2 \qquad \text{(9.1-62)}$$

which stabilizes the system and such that 

$$\chi_\infty(d) := A_\infty(d)\Delta(d)R_\infty(d) + B_\infty(d)S_\infty(d) \qquad \text{(9.1-63)}$$

is strictly Hurwitz; 

iii. After the time $T_2$ the prediction error remains inside the dead-zone 

$$|e(t)| < (1+\epsilon)^{1/2}\,\Omega, \quad \forall t > T_2 \qquad \text{(9.1-64)}$$

iv. If $r(t) \equiv r$, for the adaptive system we have 

$$y(t) - r \ \underset{(t \to \infty)}{\longrightarrow} \ \frac{R_\infty(d)}{\chi_\infty(d)}\,e(t) \qquad \text{(9.1-65a)}$$

$$\delta u(t) \ \underset{(t \to \infty)}{\longrightarrow} \ -\frac{S_\infty(d)}{\chi_\infty(d)}\,e(t) \qquad \text{(9.1-65b)}$$

Proof 

i. We proceed along the same lines as after Lemma 2 to find that for all $t > T_3$, $T_3$ being a 
finite time greater than $T_1$ in Lemma 4, the following linear boundedness condition 

$$\|\varphi_1(t-1)\| \le c_1 + c_2 \max_{i \in [1,t)} |e(i)| \qquad \text{(9.1-66)}$$

holds for bounded nonnegative reals $c_1$ and $c_2$. Now if $\{e(t)\}$ is bounded, from the above 
inequality it follows that $\{\varphi_1(t)\}$ is also bounded. Suppose on the contrary that $\{e(t)\}$ 
is unbounded. Then, there is a subsequence $\{t_i\}$ along which $\lim_{i \to \infty} |e(t_i)| = \infty$ and 
$|e(t)| \le |e(t_i)|$ for $t \le t_i$. Further, $\kappa(t_i) = \kappa > 0$. Thus 

$$\sqrt{\kappa(t_i)}\,|\varepsilon(t_i)| = \frac{\kappa^{1/2}|e(t_i)|}{m(t_i - 1)} \ge \frac{\kappa^{1/2}|e(t_i)|}{m + \|\Gamma\varphi_1(t_i - 1)\|} \ \ [(51)] \ \ \ge \frac{\kappa^{1/2}|e(t_i)|}{c_3 + c_4|e(t_i)|}$$

This implies that 

$$\lim_{i \to \infty} \sqrt{\kappa(t_i)}\,|\varepsilon(t_i)| \ge \frac{\kappa^{1/2}}{c_4} > 0$$

which contradicts (46). Then, $\{\varphi_1(t)\}$ is bounded. This is equivalent to boundedness of 
$\{y(t)\}$ and $\{\delta u(t)\}$. To prove boundedness of $\{u(t)\}$ we can use the same Bezout identity 
argument as before Theorem 1. 

ii.-iii. Since by i. $\{\varphi(t)\}$ is bounded, (46) yields 

$$\lim_{t \to \infty} \kappa(t)e^2(t) = 0$$

Hence, 

$$\limsup_{t \to \infty} |e(t)| \le (1+\epsilon)^{1/2}\,\Omega$$

which implies (64) and, hence, ii. because of the finite self-excitation stopping time. 

iv. Using (62), that $Z_t(1) = S_t(1)$ and the fact that for $t > T_2$ 

$$A_\infty(d)\Delta(d)y(t) = B_\infty(d)\delta u(t) + e(t)$$

(65) follows. 
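
The elimination behind part iv. can be written out as follows. This is only a sketch of the intermediate algebra, with the argument $d$ omitted and all polynomials frozen at their values for $t > T_2$; it uses nothing beyond (62), (63) and the model equation just displayed.

```latex
\begin{aligned}
&\text{Multiply the model equation by } R_\infty \text{ and substitute (62):} \\
&\qquad A_\infty \Delta R_\infty\, y(t) = B_\infty \left[ -S_\infty\, y(t) + Z_\infty\, r(t+T) \right] + R_\infty\, e(t)
 \ \Longrightarrow\ \chi_\infty\, y(t) = B_\infty Z_\infty\, r(t+T) + R_\infty\, e(t) \\
&\text{Multiply (62) by } A_\infty \Delta \text{ and substitute the model equation:} \\
&\qquad A_\infty \Delta R_\infty\, \delta u(t) = -S_\infty \left[ B_\infty\, \delta u(t) + e(t) \right] + A_\infty \Delta Z_\infty\, r(t+T)
 \ \Longrightarrow\ \chi_\infty\, \delta u(t) = A_\infty \Delta Z_\infty\, r(t+T) - S_\infty\, e(t) \\
&\text{For } r(t) \equiv r:\quad \Delta(d)\, r = 0, \qquad
 \frac{B_\infty(1) Z_\infty(1)}{\chi_\infty(1)} = \frac{B_\infty(1) Z_\infty(1)}{B_\infty(1) S_\infty(1)} = 1
 \quad \text{since } \Delta(1) = 0 \text{ and } Z_\infty(1) = S_\infty(1).
\end{aligned}
```

Since $\chi_\infty(d)$ is strictly Hurwitz, the response to the constant reference settles at $r$ for the output and at zero for the input increment, which leaves the error-driven terms in (65a) and (65b).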

It is difficult to find a sharp estimate of the self-excitation intensity $|\nu(k)|$ which can 
guarantee the condition (59). On the other hand, even a conservative estimate 
of this intensity, such as that in (49) and (60), would depend in practice on a 
priori unknown parameters (ratio between the maximum and minimum singular 
value of the transpose of the Sylvester resultant matrix associated with the true 
plant, how small $\|\tilde{\theta}(t)\|$ must be in order to guarantee that $\Xi(\theta(t)) > \varsigma$, etc.). 
Therefore, the practical relevance of Theorem 2 is to indicate that the combination 
of a high intensity self-excitation dither with a CT-NRLS with dead-zone 
can make the adaptive system capable of self-tuning on a stable behaviour in 
the presence of bounded disturbances. 



9.1.3 The Neglected Dynamics Case 

In this subsection we discuss qualitatively some frequency-domain and related filtering 
ideas which turn out to be important in the neglected dynamics case. The discussion 
parallels the one on the same subject presented in the first part of Sect. 8.9 
for STCC. Here we extend the ideas to adaptive SIORHC. However, we shall refrain 
from embarking on elaborating any globally convergent adaptive multistep 
predictive controller for the neglected dynamics case. This is in fact an issue in the 
realm of current research endeavour. For some results on this point see [CMS91]. 

We assume that the plant to be controlled is again given by (8.9-1). Here, 
however, (8.9-1) is modified as follows 

$$A^{\circ}(d)\Pi(d)y(t) = B^{\circ}(d)\delta u(t) + A^{\circ}(d)\Pi(d)\omega(t) \qquad \text{(9.1-67a)}$$

where 

$$\delta u(t) := \Pi(d)u(t) \qquad \text{(9.1-67b)}$$

In this way, the presence of the common divisor $\Pi(d)$ of $A^{\circ}(d)$ and $B^{\circ}(d)$ as in 
(8.9-2) is ruled out. This is important since, unlike Cheap Control, SIORHC design 
equations cannot be easily managed in the presence of the above common 
divisor and, more importantly, stability of the controlled system is not guaranteed. 
Notice that (67b) generalizes our usual notational convention of denoting by $\delta u(t)$ 
simply an input increment. The reader should realize before proceeding any further 
that SIORHC with no formal changes is fully compatible with (67), in that its 
terminal input constraints are still meaningful for the notion (67b) of generalized 
input increments. From (67) we obtain the reduced-order model (Cf. (8.9-3)) 

$$A(d)\Pi(d)y(t) = B(d)\delta u(t) + n(t) \qquad \text{(9.1-68a)}$$

$$A^{\circ}(d) = A(d)A^{u}(d) \qquad B^{\circ}(d) = B(d)B^{u}(d) \qquad \text{(9.1-68b)}$$

and 

$$n(t) = \frac{B(d)\left[ B^{u}(d) - A^{u}(d) \right]}{A^{u}(d)}\,\delta u(t) + A(d)\Pi(d)\omega(t) \qquad \text{(9.1-68c)}$$

In order to identify a reduced-order model which adequately fits the 
plant within the useful frequency-band, we proceed in accordance with the guidelines 
given in Sect. 8.9. Here, we select a low-pass stable and stably-invertible 
transfer function $L(d)$ which rolls off beyond the useful frequency-band. 
By prefiltering with $L(d)$ the plant I/O variables we arrive at the following 
model 

$$A(d)\Pi(d)y_L(t) = B(d)\delta u_L(t) + n_L(t) \qquad \text{(9.1-69a)}$$

$$y_L(t) := L(d)y(t) \qquad \delta u_L(t) := L(d)\delta u(t) \qquad \text{(9.1-69b)}$$

and $n_L(t) := L(d)n(t)$. This is formally the same as (8.9-4). There is, however, 
a difference between the two models in that the route that we have now followed 
to arrive at (69) has been deliberately aimed at avoiding the introduction of $\Pi(d)$ 
as a common divisor of $A(d)\Pi(d)$ and $B(d)$. The polynomials to be identified are 
$A(d)$ and $B(d)$ via the use of the filtered variables $\delta y_L(t) := \Pi(d)y_L(t)$ and $\delta u_L(t)$. 
These are related by the system 

$$A(d)\delta y_L(t) = B(d)\delta u_L(t) + n_L(t) \qquad \text{(9.1-70)}$$

We next focus on how to robustify the control system by introducing suitable 
dynamic weights in the underlying control problem. This is done by adopting a 
procedure similar to the one in (8.9-5)-(8.9-8). Specifically, we consider the plant 
I/O filtered variables 

$$y_H(t) := H(d)y(t) \qquad \delta u_H(t) := H(d)\delta u(t) \qquad \text{(9.1-71a)}$$

with $H(d)$ a monic high-pass strictly Hurwitz polynomial, and the model 

$$A(d)\Pi(d)y_H(t) = B(d)\delta u_H(t) + H(d)n(t) \qquad \text{(9.1-71b)}$$

where $n(t)$ is assumed to be a zero mean white noise. Then, we compute the 
SIORHC law related to the cost 

$$\frac{1}{T} \sum_{k=t}^{t+T-1} \mathcal{E}\left\{ \varepsilon_y^2(k+1) + \psi_u\,\delta u_H^2(k) \mid y^t, \delta u^{t-1} \right\} \qquad \text{(9.1-72a)}$$

$$\varepsilon_y(k) := y_H(k) - H(1)r(k) \qquad \text{(9.1-72b)}$$

and the constraints 

$$\left. \begin{array}{ll} \delta u_H(k) = 0, & t + T \le k \le t + T + n - 1 \\ \mathcal{E}\left\{ y_H(k) \mid y^t, \delta u^{t-1} \right\} = H(1)r(t + T + 1), & t + T + 1 \le k \le t + T + n \end{array} \right\} \qquad \text{(9.1-72c)}$$

Here, $T > n$, $n$ being the McMillan degree of $B(d)/A(d)$. For the necessary details 
on the above SIORHC law we refer the reader to Sect. 7.7. The rationale for introducing 
the above filtered variables is similar to the one discussed in (8.9-5)-(8.9-8). 
See also (7.5-29)-(7.5-32). 

In the above considerations we have pointed out that, in contrast with the 
ideal case, in the neglected dynamics case it is essential to identify a reduced-order 
model using low-pass prefiltered I/O variables. In particular, identification 
of an incremental model as (1c) with no prefiltering is by all means inadvisable, in 
that the $\Delta(d)$ polynomial enhances the high frequency components of the equation 
error. The second important point is that, in order to robustify the controlled 
system, it is advisable that the control design be carried out relative to high-pass 
filtered I/O variables as in (71) and (72). 

Main points of the section By using a self-excitation mechanism designed
to avoid possible singularities, a globally convergent adaptive SIORHC algorithm
based on the CT-NRLS can be constructed. The constant-trace feature of the
estimator makes adaptive SIORHC suitable for slowly time-varying plants. In
order to choose the self-excitation signal, the interaction between self-excitation
and feedforward must be considered. If this is properly done, for a time-invariant
plant the self-excitation turns off forever after a finite time.

In the ideal case, the intensity of the dither due to self-excitation can be chosen
vanishingly small. In the case of bounded disturbances, where the CT-NRLS
identifier is equipped with a dead-zone facility, the dither intensity is required to
be high enough to force the final estimated plant parameters to be nonpathological.
Low-pass filtering for identification and high-pass filtering for control are
often of paramount importance for successful operation in the
presence of neglected dynamics.



9.2 Implicit Multistep Prediction Models of Linear-Regression Type

In Sect. 8.7 we showed that the output prediction of a MV-controlled 
CARMA plant can be described in terms of a linear-regression model. This fact 
was exploited to construct an implicit stochastic ST controller based on the MV 
control law. The aim of this section is to show that a similar property holds also for 
more general control laws. This is of interest in that it allows us to construct im- 
plicit adaptive controllers for CARMA plants with underlying control laws of wider 
applicability than MV control. In this respect, particular attention will be devoted 
to underlying long-range predictive control laws. As will be seen, the resulting 
implicit adaptive predictive controllers exhibit advantages and disadvantages over 
the explicit ones. One disadvantage is that there is no available proof of a globally 
convergent implicit adaptive predictive control scheme. The only possible excep- 
tion to this is [Loz89] which is however solely focused on the adaptive stabilization 
problem with no performance-related goal. On the positive side, implicit adaptive 
predictive controllers can exhibit excellent local self-optimizing properties in the 
presence of neglected dynamics. This makes them attractive for autotuning simple 
controllers of highly complex plants. 

The starting point of our study is the SISO CARMA plant 

$$A(d)\,y(t) = B(d)\,u(t) + C(d)\,e(t) \tag{9.2-1a}$$

with: $A(0) = C(0) = 1$;

• $n := \max\{\partial A(d), \partial B(d), \partial C(d)\}$; (9.2-1b)

• $A(d)$, $B(d)$, $C(d)$ have unit gcd; (9.2-1c)

• $C(d)$ is strictly Hurwitz; (9.2-1d)

• $A(d)$ and $B(d)$ have strictly Hurwitz gcd. (9.2-1e)

Further, the innovations process $e$ is zero-mean, wide-sense stationary and white, with
variance

$$\sigma_e^2 := \mathcal{E}\{e^2(t)\} > 0 \tag{9.2-1f}$$

and such that

$$\mathcal{E}\{u(t)\,e(t+i)\} = 0, \qquad t \in \mathbb{Z},\ i \in \mathbb{Z}_1 \tag{9.2-1g}$$

Irrespective of the actual plant I/O delay $\tau = \mathrm{ord}\,B(d)$, we can follow similar lines
to the ones which led us to (7.3-9) so as to find, for $k \in \mathbb{Z}_1$,

$$y(t+k) = \frac{Q_k(d)B(d)}{C(d)}\,u(t+k) + \frac{G_k(d)}{C(d)}\,y(t) + Q_k(d)\,e(t+k) \tag{9.2-2}$$

where $(Q_k(d), G_k(d))$ is the minimum-degree solution w.r.t. $Q_k(d)$ of the following
Diophantine equation

$$C(d) = A(d)Q_k(d) + d^k G_k(d), \qquad \partial Q_k(d) \le k-1 \tag{9.2-3}$$
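The Diophantine equation (3) can be solved by carrying out k steps of the long division of C(d) by A(d). The following is a minimal sketch of that idea, not a definitive implementation: polynomials in d are stored as ascending coefficient lists, and the function names and the numerical values of A(d) and C(d) are hypothetical choices made only for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials in d given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def solve_predictor_diophantine(A, C, k):
    """Return (Q_k, G_k) with C = A*Q_k + d^k*G_k and deg Q_k <= k-1.

    A and C are ascending coefficient lists in the unit-delay operator d,
    with A[0] != 0 (A(0) = 1 in the text).
    """
    # Working remainder, padded so that every index touched below exists.
    rem = [float(c) for c in C] + [0.0] * (k + len(A))
    Q = []
    for j in range(k):
        q = rem[j] / A[0]            # next coefficient of the long division C/A
        Q.append(q)
        for i, a in enumerate(A):    # subtract q * d^j * A(d)
            rem[j + i] -= q * a
    G = rem[k:]                      # what is left equals d^k * G_k(d)
    while len(G) > 1 and abs(G[-1]) < 1e-12:
        G.pop()                      # drop the padding zeros
    return Q, G


# Hypothetical example: A(d) = 1 - 0.5 d, C(d) = 1 + 0.3 d, k = 2.
A = [1.0, -0.5]
C = [1.0, 0.3]
Q2, G2 = solve_predictor_diophantine(A, C, 2)
```

Multiplying back, `poly_mul(A, Q2)` plus `G2` shifted by two powers of d should reproduce C(d) up to rounding, which is a convenient sanity check on any such solver.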



The problem that we wish to study next is to find conditions on the "past"
input sequence $u^{t-1}$ under which (2) simplifies to the form

$$y(t+k) = w_1 u(t+k-1) + \cdots + w_k u(t) + S_k s(t) + Q_k(d)\,e(t+k) \tag{9.2-4}$$

for every possible "future" input sequence $u_{[t,t+k)}$. In (4), $s(t)$ denotes a vector with
a finite number of components from $y^t$ and $u^{t-1}$, and $S_k$ a row-vector of compatible
dimension. Further, because of the degree constraint in (3), $Q_k(d)e(t+k)$ is a linear
combination of future innovations samples in $e_{[t+1,t+k]}$. The difference between (4)
and (2) is that the latter, due to the presence of $C(d)$ at the denominator, involves
an infinite number of terms from $y^t$ and $u^{t+k-1}$. Using a terminology similar to
that adopted in Sect. 8.7, we shall call (4) an implicit prediction model of
linear-regression type.

To solve the problem stated above, we first find the minimum-degree solution
$(W_k(d), L_k(d))$ w.r.t. $W_k(d)$ of the following Diophantine equation

$$Q_k(d)B(d) = C(d)W_k(d) + d^{k+1}L_k(d), \qquad \partial W_k(d) \le k \tag{9.2-5}$$

Using (5) into (2), we get

$$y(t+k) = W_k(d)\,u(t+k) + \frac{L_k(d)}{C(d)}\,u(t-1) + \frac{G_k(d)}{C(d)}\,y(t) + Q_k(d)\,e(t+k) \tag{9.2-6}$$

Note that from (3)

$$\begin{aligned} C(d)B(d) &= A(d)Q_k(d)B(d) + d^k G_k(d)B(d) \\ &= C(d)A(d)W_k(d) + d^{k+1}\left[A(d)L_k(d) + d^{-1}G_k(d)B(d)\right] \end{aligned}$$

where $d^{-1}B(d)$ is a polynomial because the plant has at least a unit I/O delay.
Then, it follows that the coefficients of $W_k(d)$ coincide with the first $k$ terms of the
long division of $B(d)/A(d)$, viz.

$$W_k(d) = w_1 d + \cdots + w_k d^k \tag{9.2-7}$$

where $w_i$, $i = 1, \ldots, k$, are the first $k$ samples of the impulse response associated
with the transfer function $B(d)/A(d)$.
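As a small illustration of (7), the samples w_1, ..., w_k can be generated by the standard long-division recursion for B(d)/A(d). This is only a sketch under stated assumptions: polynomials are ascending coefficient lists in d, the function name is hypothetical, and the first-order plant below is an invented example.

```python
def impulse_response(B, A, k):
    """First k+1 coefficients h_0..h_k of the long division B(d)/A(d).

    B and A are ascending coefficient lists in d, with A[0] != 0.
    """
    h = []
    for i in range(k + 1):
        b_i = B[i] if i < len(B) else 0.0
        # Convolution sum A[1]*h[i-1] + A[2]*h[i-2] + ... over available terms.
        acc = sum(A[j] * h[i - j] for j in range(1, min(i, len(A) - 1) + 1))
        h.append((b_i - acc) / A[0])
    return h


# Hypothetical example: A(d) = 1 - 0.5 d, B(d) = d (unit I/O delay).
A = [1.0, -0.5]
B = [0.0, 1.0]
h = impulse_response(B, A, 3)
w = h[1:]   # w_1, w_2, w_3, the coefficients of W_3(d)
```

Here `h[0]` is zero because of the unit delay, so W_3(d) starts at the power d, in agreement with (7).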

We see that, in order to rewrite (6) in the form (4), two polynomials $U_k(d)$ and
$T_k(d)$ must exist such as to satisfy

$$\frac{L_k(d)}{C(d)}\,u(t-1) + \frac{G_k(d)}{C(d)}\,y(t) = U_k(d)\,u(t-1) + T_k(d)\,y(t) \tag{9.2-8}$$

where the equality has to be intended in a mean-square sense. For an arbitrary
stochastic input process $u$ the above equation is not solvable w.r.t. $U_k(d)$ and
$T_k(d)$. On the other hand, in a regulation problem we are only interested in input
sequences generated up to time $t-1$ by a time-invariant nonanticipative linear
feedback compensator of the form

$$R(d)\,u(i) = -S(d)\,y(i), \qquad t-n \le i \le t-1 \tag{9.2-9a}$$

where $R(d)$ and $S(d)$ are polynomials with $R(0) \ne 0$ and such that

$$\partial R(d) = n, \quad \partial S(d) = n-1, \quad R(d) \text{ and } S(d) \text{ coprime} \tag{9.2-9b}$$

Figure 9.2-1: Visualization of the constraint (9a).



The lower bound $t-n$ on the time index $i$ in (9a) indicates that the stated regulation
law need not be used before the time $t-n$ (see Fig. 1). We point out that
the degree assumptions on $R(d)$ and $S(d)$ are consistent with both steady-state
LQ stochastic regulation (Cf. Problem 7.3-16) and stochastic predictive regulation
(Cf. Problem 7.7-3). Let us multiply each term of (6) by $C(d)R(d)$ to get

$$C(d)R(d)\,y(t+k) = C(d)R(d)\left[W_k(d)\,u(t+k) + Q_k(d)\,e(t+k)\right] + L_k(d)R(d)\,u(t-1) + G_k(d)R(d)\,y(t)$$

Since $\partial L_k(d) \le n-1$, the third additive term on the R.H.S. of the last equation
only involves input variables comprised in (9a). Thus,

$$L_k(d)R(d)\,u(t-1) + G_k(d)R(d)\,y(t) = \left[R(d)G_k(d) - dS(d)L_k(d)\right]y(t) \tag{9.2-10}$$

In order to fulfill (8), this quantity must coincide in a mean-square sense with

$$C(d)\left[U_k(d)R(d)\,u(t-1) + T_k(d)R(d)\,y(t)\right] = C(d)\left[R(d)T_k(d) - dS(d)U_k(d)\right]y(t) \tag{9.2-11}$$

where the equality follows from (9a) provided that $\partial U_k(d) \le n-1$. Since by (1a),
(1f) and (1g), the process $y$ contains an additive white component with nonzero
variance, (10) equals (11) in a mean-square sense if and only if the following
Diophantine equation

$$C(d)\left[R(d)T_k(d) - dS(d)U_k(d)\right] = R(d)G_k(d) - dS(d)L_k(d) \tag{9.2-12}$$

admits a polynomial solution $U_k(d)$ and $T_k(d)$. By coprimeness of $R(d)$ and $S(d)$,
solvability of (12) is equivalent to requiring that $C(d)$ divide the R.H.S. of (12):

$$C(d) \mid \left[R(d)G_k(d) - dS(d)L_k(d)\right] \tag{9.2-13}$$

Now, using (3) and (5) we find

$$dS(d)L_k(d) - R(d)G_k(d) = \frac{Q_k(d)\,\chi_{cl}(d)}{d^k} - \frac{C(d)\left[R(d) + S(d)W_k(d)\right]}{d^k} \tag{9.2-14}$$

where

$$\chi_{cl}(d) := A(d)R(d) + B(d)S(d)$$

Therefore, (13) is satisfied provided that

$$C(d) \mid \chi_{cl}(d) \tag{9.2-15}$$

Further, by degree considerations, we see that, under (15), the minimum-degree
solution of (12) is such that

$$\partial U_k(d) \le n-1, \qquad \partial T_k(d) \le n-1$$

We sum up the above results in the following theorem.

Theorem 9.2-1. Assume that the CARMA plant (1) is fed over the time interval
$[t-n, t-1]$ by the linear feedback compensator (9). Then, irrespective of $u_{[t,t+k)}$,
the following implicit prediction models of linear-regression type

$$y(t+k) = w_1 u(t+k-1) + \cdots + w_k u(t) + S_k s(t) + Q_k(d)\,e(t+k), \qquad \forall k \in \mathbb{Z}_1 \tag{9.2-16a}$$

$$s(t) := \left[\,(y^{t}_{t-n+1})' \ \ (u^{t-1}_{t-n})'\,\right]' \tag{9.2-16b}$$

hold provided that

$$C(d) \mid \left[A(d)R(d) + B(d)S(d)\right] \tag{9.2-17}$$
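Condition (17) is a plain polynomial divisibility test, so it can be checked numerically by dividing A(d)R(d) + B(d)S(d) by C(d) and inspecting the remainder. The sketch below assumes ascending coefficient lists in d and a numerical tolerance; the helper names and the example plant and compensators are hypothetical and chosen only so that one compensator passes the test and the other fails.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials given as ascending coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0.0) + (q[i] if i < len(q) else 0.0)
            for i in range(n)]


def poly_remainder(num, den, tol=1e-9):
    """Remainder of the ordinary polynomial division num(d) / den(d)."""
    r = [float(c) for c in num]
    den = [float(c) for c in den]
    while len(den) > 1 and abs(den[-1]) < tol:
        den.pop()                       # strip zero leading coefficients
    while len(r) >= len(den):
        factor = r[-1] / den[-1]        # cancel the current leading term
        shift = len(r) - len(den)
        for i, c in enumerate(den):
            r[shift + i] -= factor * c
        r.pop()
    return r


def admits_implicit_models(A, B, C, R, S, tol=1e-9):
    """True if C(d) divides A(d)R(d) + B(d)S(d), i.e. condition (9.2-17)."""
    chi_cl = poly_add(poly_mul(A, R), poly_mul(B, S))
    return all(abs(c) < tol for c in poly_remainder(chi_cl, C, tol))


# Hypothetical example with n = 1: A = 1 - 0.5 d, B = d, C = 1 + 0.3 d.
A, B, C = [1.0, -0.5], [0.0, 1.0], [1.0, 0.3]
ok = admits_implicit_models(A, B, C, R=[1.0, -0.3], S=[1.6])   # chi_cl = C (1 + 0.5 d)
bad = admits_implicit_models(A, B, C, R=[1.0, -0.3], S=[1.0])
```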
The extension of Theorem 1 to 2-DOF controllers is given by the next problem. 

Problem 9.2-1 Assume that for $i \in [t-n, t-1]$ the inputs to the plant of Theorem 1 are
in accordance with the following difference equation

$$R(d)\,u(i) = -S(d)\,y(i) + C(d)\,v(i) \tag{9.2-18}$$

where $R(d)$ and $S(d)$ are as in (9) and satisfy (17), and $v$ denotes an exogenous random sequence
possibly related to a reference to be tracked by the plant output (Cf. (7.5-11)). Then, show that
(16a) still holds provided that $s(t)$ be refined as follows

$$s(t) := \left[\,(y^{t}_{t-n+1})' \ \ (u^{t-1}_{t-n})' \ \ (v^{t-1}_{t-n})'\,\right]' \tag{9.2-19}$$

Remark 9.2-1 The vector $s(t)$ in (16b) or (19) will be referred to as the pseudostate
since, under the stated past input conditions, $s(t)$ is a sufficient statistic
to predict $y(t+k)$ in a MMSE sense on the basis of $y^t$, $u^{t+k-1}$ (Cf. Theorem 7.3-1).
□

Remark 9.2-2 Conditions (17)-(19) have the following state-space interpretation
(Cf. (7.5-4) and successive considerations). Eqs. (17)-(19) are equivalent to
assuming that the control action on the plant over the time interval $i \in [t-n, t-1]$
is given by

$$u(i) = F\hat{x}(i \mid i) + v(i) \tag{9.2-20}$$

the R.H.S. of (20) being the sum of $v(i)$ with a constant feedback from the steady-state
Kalman filtered estimate $\hat{x}(i \mid i)$ of a plant state $x(i)$. □



Main points of the section CARMA plants admit implicit multistep prediction 
models of linear-regression type, provided that their inputs over a finite past are 
given by feedback compensation from the steady-state Kalman filtered estimate of 
a plant state. 

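Problem 9.2-2 [MZ89c] Consider (16a) where $s(t)$ is possibly given by (19). Let

$$\frac{A(d)}{C(d)} = \sum_{i=0}^{\infty}\alpha_i d^i \qquad\text{and}\qquad \frac{B(d)}{C(d)} = \sum_{i=0}^{\infty}\beta_i d^i$$

Show that

$$w_i = \beta_i - \alpha_1 w_{i-1} - \cdots - \alpha_{i-1} w_1$$

$$\varepsilon_i(t+i) := Q_i(d)\,e(t+i) = e(t+i) - \alpha_1\varepsilon_{i-1}(t+i-1) - \cdots - \alpha_{i-1}\varepsilon_1(t+1)$$

with

$$w_1 = \beta_1 \qquad\text{and}\qquad \varepsilon_1(t+1) = e(t+1).$$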



9.3 Use of Implicit Prediction Models in Adaptive 
Predictive Control 

The interest in the multistep implicit prediction models (2-16)-(2-19) is that they
can be exploited in adaptive predictive control schemes as if the pseudostate $s(t)$
were a true plant state, irrespective of the innovations polynomial $C(d)$. However,
unlike the implicit linear-regression model of Sect. 8.7 where the prediction step $k$
equals the plant I/O delay $\tau$, the parameters of the multistep implicit prediction
models (2-16)-(2-19) cannot be directly identified via recursive linear regression
algorithms since, for $k > 1$, $Q_k(d)e(t+k)$ can be correlated with $u^{t+k-1}_{t+1}$.
Nevertheless, defining new "observations" by

$$\begin{aligned} z_t(t+k) &:= y(t+k) - w_1 u(t+k-1) - \cdots - w_{k-1}u(t+1) \\ &= w_k u(t) + S_k s(t) + \nu_k(t+k) \qquad [(2\text{-}16a)] \end{aligned} \tag{9.3-1}$$

we find that the equation error

$$\nu_k(t+k) := Q_k(d)\,e(t+k)$$

by (2-1g) is uncorrelated with the regressor $\varphi(t) := [\,u(t)\ \ s'(t)\,]'$. Thus, we can
introduce the following identification scheme where any linear regression algorithm
could replace the RLS algorithm which we shall refer to hereafter.

Figure 9.3-1: Signal flow in the interlaced identification scheme.



Interlaced Identification Scheme 

At each time step $t$, and for each $k = 1, 2, \ldots, T+n-1$:

i. Compute

$$z_t(t+k) := y(t+k) - w_1(t-1)\,u(t+k-1) - \cdots - w_{k-1}(t-1)\,u(t+1) \tag{9.3-2}$$

where $w_i(t-1)$, $i \le k-1$, is the RLS estimate of $w_i$ based on the regressors
$\varphi^{t-1} = \{u^{t-1}, s^{t-1}\}$.

ii. Update the RLS estimates $w_k(t-1)$ and $S_k(t-1)$ so as to get $w_k(t)$ and
$S_k(t)$ based on the new observation $z_t(t+k)$ and the model

$$z_t(t+k) = w_k u(t) + S_k s(t) + \nu_k(t+k) \tag{9.3-3}$$

iii. Cycle through i. and ii.

Fig. 1 depicts the signal flows in the interlaced identification scheme for $T+n-1 = 3$.
The upper arrows refer to the regressor $[\,u(t)\ \ s'(t)\,]'$, while the regressand $z_t(t+k)$
is indicated at the bottom of each individual RLS identifier.

Remark 9.3-1 As shown in Fig. 1, the above identification scheme comprises
$T+n-1$ separate RLS estimators all using a common regressor and hence a
common updating gain. This feature moderates the overall computational burden
of the interlaced identification scheme. □
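The common-regressor, common-gain feature noted in Remark 9.3-1 can be sketched as a bank of RLS estimators that share a single covariance matrix. The code below is an illustration only, not the book's algorithm in full: the class name, the initial covariance `p0`, and the noise-free test data are assumptions, and the interlaced construction of the observations (9.3-2) is left to the caller.

```python
import random


class CommonGainRLSBank:
    """Bank of RLS estimators sharing one regressor and one updating gain."""

    def __init__(self, n_params, n_models, p0=1e4):
        # One covariance matrix P for the whole bank (hypothetical initial value).
        self.P = [[p0 if i == j else 0.0 for j in range(n_params)]
                  for i in range(n_params)]
        self.theta = [[0.0] * n_params for _ in range(n_models)]

    def update(self, phi, observations):
        n = len(phi)
        P_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
        gain = [v / denom for v in P_phi]      # equals P_new * phi
        for i in range(n):                     # P_new = P - gain * (P phi)'
            for j in range(n):
                self.P[i][j] -= gain[i] * P_phi[j]
        # Every estimator reuses the same gain; only the residual differs.
        for theta, z in zip(self.theta, observations):
            err = z - sum(phi[i] * theta[i] for i in range(n))
            for i in range(n):
                theta[i] += gain[i] * err


# Noise-free check on invented data: two models, three parameters each.
rng = random.Random(0)
true_params = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7]]
bank = CommonGainRLSBank(n_params=3, n_models=2)
for _ in range(200):
    phi = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    z = [sum(p * x for p, x in zip(tp, phi)) for tp in true_params]
    bank.update(phi, z)
```

The covariance update, which dominates the cost, is performed once per time step regardless of how many estimators are in the bank.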

In order to ascertain the potential consistency properties of the interlaced
estimation scheme, let us assume that, under a constant and stabilizing control law
allowing implicit models, the estimates of the parameters in (3) converge to $\hat{w}_k$ and
$\hat{S}_k$. Moreover, assume that $\hat{w}_j = w_j$, $\forall j = 1, 2, \ldots, k-1$, hence $\hat{z}_t(t+k) = z_t(t+k)$.
Thus, under stationarity and ergodicity of the involved processes, Theorem 6.4-2
shows that the following orthogonality condition between estimation residuals and
regressor must be satisfied for the $k$-th estimator

$$\begin{aligned} 0 &= \mathcal{E}\left\{[\,u(t)\ \ s'(t)\,]'\left(z_t(t+k) - \hat{w}_k u(t) - \hat{S}_k s(t)\right)\right\} \\ &= \mathcal{E}\left\{[\,u(t)\ \ s'(t)\,]'\left(\tilde{w}_k u(t) + \tilde{S}_k s(t) + \nu_k(t+k)\right)\right\} \\ &= \mathcal{E}\left\{[\,u(t)\ \ s'(t)\,]'\left(\tilde{w}_k u(t) + \tilde{S}_k s(t)\right)\right\} \end{aligned} \tag{9.3-4}$$

where $\tilde{w}_k := w_k - \hat{w}_k$, $\tilde{S}_k := S_k - \hat{S}_k$, and the last equality follows since

$$\mathcal{E}\left\{[\,u(t)\ \ s'(t)\,]'\,\nu_k(t+k)\right\} = 0$$

Note that, according to (2-18), $u(t) = Fs(t) + v(t)$. Assume that $v(t)$ has a component
with nonzero variance and uncorrelated with $s(t)$. Then from (4) it follows
that $\tilde{w}_k = 0$. Hence, $\hat{w}_j = w_j$, $\forall j = 1, 2, \ldots, k-1$, implies $\hat{w}_k = w_k$. Since
$\hat{z}_t(t+1) = z_t(t+1)$, $\hat{w}_1 = w_1$. Therefore, by induction, the estimates of the
$w_k$'s are potentially consistent. Moreover, from (4) we also get $\mathcal{E}\{s(t)s'(t)\}\,\tilde{S}_k' = 0$,
which yields $\hat{S}_k = S_k$ if $\Psi_s := \mathcal{E}\{s(t)s'(t)\} > 0$.



Implicit Adaptive SIORHC 

We can exploit the above interlaced identification scheme to construct an implicit
adaptive SIORHC algorithm for the CARIMA plant

$$A(d)\Delta(d)\,y(t) = B(d)\,\delta u(t) + C(d)\,e(t) \tag{9.3-5}$$

having all the properties as in (2-1) once $A(d)$ is changed into $A(d)\Delta(d)$. In this
case $n = \max\{\partial A(d)+1, \partial B(d), \partial C(d)\}$ and the pseudostate $s(t)$ is as in (2-19)
with $u^{t-1}_{t-n}$ replaced by $\delta u^{t-1}_{t-n}$. If we consider SIORHC as the underlying control
problem, by Enforced Certainty Equivalence, we can let the plant input
at time $\tau := t+T+n-1$ be given by

$$\left.\begin{array}{l} R_t(d)\,\delta u(\tau) = -S_t(d)\,y(\tau) + v(\tau) + \eta(\tau) \\ v(\tau) = Z_t(d)\,r(\tau+T) \end{array}\right\} \tag{9.3-6}$$

Here $R_t(d)$, $S_t(d)$, $Z_t(d)$, with $R_t(0) = 1$ and $\partial Z_t(d) = T-1$, are the polynomials
corresponding to the SIORHC law (1-6) computed by using the RLS estimates
$w_k(t)$ and $S_k(t)$ from the interlaced identification scheme.

The last term $\eta(\tau)$ in the first equation of (6) is an additive dither, viz. an
exogenous variable introduced so as to support parameter identifiability. E.g., $\eta$ can
be a zero-mean w.s. stationary white random sequence uncorrelated with both $e$
and $r$, and with variance $\sigma_\eta^2 > 0$ smaller than $\sigma_e^2$. To sum up, once the pseudostate
$s(t)$ is formed and the interlaced estimation scheme used, we can use the RLS
estimates $w_k(t)$ and $S_k(t)$ to compute $\delta u(t)$ via (1-6) as if the innovations
polynomial $C(d)$ were equal to one.

Implicit Adaptive TCI: MUSMAR

A noticeable simplification in the use of implicit models in adaptive predictive
control can be achieved by adopting underlying control problems compatible with
constraints on the future inputs $u_{(t,t+T)}$ in addition to the constraints (2-9) on the
past inputs $u_{[t-n,t)}$. We shall describe this by focusing on the pure regulation
problem. Consider then again the CARMA plant (2-1). Let

$$n := \max\{\partial A(d), \partial B(d), \partial C(d)\}$$

and

$$s(t) := \left[\,(y^{t}_{t-n+1})' \ \ (u^{t-1}_{t-n})'\,\right]'$$

Assume also that

$$R_p(d)\,u(i) = -S_p(d)\,y(i), \qquad t-n \le i \le t-1 \tag{9.3-7a}$$

or

$$u(i) = F_p's(i), \qquad t-n \le i \le t-1 \tag{9.3-7b}$$

where

$$\partial R_p(d) = n, \quad \partial S_p(d) = n-1, \quad R_p(d) \text{ and } S_p(d) \text{ coprime} \tag{9.3-7c}$$

the subscript $p$ is appended to quantities related to the "past" regulation law and
$F_p$ is a vector whose components are the coefficients of the polynomials $R_p(d)$
and $S_p(d)$. Assuming that

$$C(d) \mid \left[A(d)R_p(d) + B(d)S_p(d)\right] \tag{9.3-8}$$

from Theorem 2-1 it follows that

$$y(t+k) = w_1 u(t+k-1) + \cdots + w_k u(t) + S_k s(t) + Q_k(d)\,e(t+k), \qquad \forall k \in \mathbb{Z}_1 \tag{9.3-9}$$

Suppose now that in addition to the "past" constraints, the following constraints
are adopted for the "future" regulation law

$$R_f(d)\,u(i) = -S_f(d)\,y(i), \qquad t+1 \le i \le t+T-1 \tag{9.3-10a}$$

or

$$u(i) = F_f's(i), \qquad t+1 \le i \le t+T-1 \tag{9.3-10b}$$

The situation related to (7) and (10) is depicted in Fig. 2. There it is shown that
while both the past and the future inputs are generated via constant feedback laws
the input at the "current" time $t$ is unconstrained. Fig. 2 should be compared with
Fig. 2-1 in order to visualize the additional constraints that we are now adopting.
We now go back to (9) to write successively

$$y(t+1) = w_1 u(t) + S_1 s(t) + e(t+1)$$

$$\begin{aligned} u(t+1) &= F_f's(t+1) \\ &= \mu_2 u(t) + \Lambda_2's(t) + \nu_2(t+1) \end{aligned}$$

$$\begin{aligned} y(t+2) &= w_1 u(t+1) + w_2 u(t) + S_2 s(t) + Q_2(d)\,e(t+2) \\ &= \theta_2 u(t) + \Gamma_2's(t) + \varepsilon_2(t+2) \end{aligned}$$

Figure 9.3-2: Visualization of the constraints (7) and (10). On the time axis, the inputs from $t-n$ to $t-1$ and those from $t+1$ to $t+T-1$ are all given by a constant feedback, while the input at time $t$ is unconstrained.

Note that $\nu_2(t+1)$ is proportional to $e(t+1)$, while $\varepsilon_2(t+2)$ is a linear combination
of $e(t+1)$ and $e(t+2)$. By induction, we can thus prove the following result.

Theorem 9.3-1. Consider the CARMA plant (2-1). Let its past inputs $u_{[t-n,t)}$
satisfy (7), and its future inputs $u_{(t,t+T)}$ be given as in (10). Then, irrespective
of $u(t)$, the following implicit prediction models of linear-regression type hold for
$1 \le i \le T$

$$y(t+i) = \theta_i u(t) + \Gamma_i's(t) + \varepsilon_i(t+i) \tag{9.3-11a}$$

$$u(t+i-1) = \mu_i u(t) + \Lambda_i's(t) + \nu_i(t+i-1) \tag{9.3-11b}$$

where

$$\mu_1 = 1 \qquad\text{and}\qquad \Lambda_1 = 0 \tag{9.3-11c}$$

$$\mathcal{E}\left\{\varepsilon_i(t+i)\,[\,u(t)\ \ s'(t)\,]\right\} = \mathcal{E}\left\{\nu_{i+1}(t+i)\,[\,u(t)\ \ s'(t)\,]\right\} = 0 \tag{9.3-11d}$$

$\theta_i$ and $\mu_i$ depend on the future regulation law, and $\Gamma_i$ and $\Lambda_i$ depend on both the
past and future regulation laws.

The implicit prediction models (11) make it easy to solve the following problem.
Under the validity conditions of Theorem 1, find the input variable

$$u(t) \in [s(t)] := \mathrm{Span}\{s(t)\} \tag{9.3-12}$$

minimizing the performance index given by the conditional expectation

$$C_T = \mathcal{E}\{J_T \mid s(t)\} \tag{9.3-13a}$$

$$J_T := \frac{1}{T}\sum_{i=1}^{T}\left[y^2(t+i) + \rho\,u^2(t+i-1)\right] \tag{9.3-13b}$$

In (12), $[s(t)]$ denotes the subspace of all random variables given by linear
combinations of the components of $s(t)$ (Cf. (6.1-20)). Recalling (D-1) and using the
notation introduced in (6.4-10a), we get

$$\mathcal{E}\{y^2(t+i) \mid s(t)\} = \hat{y}_t^2(t+i) + \mathcal{E}\{\tilde{y}_t^2(t+i) \mid s(t)\}$$

$$\begin{aligned} \hat{y}_t(t+i) &:= \mathcal{E}\{y(t+i) \mid s(t)\} \\ &= \theta_i u(t) + \Gamma_i's(t) \qquad [(11a)] \end{aligned}$$

$$\begin{aligned} \tilde{y}_t(t+i) &:= y(t+i) - \hat{y}_t(t+i) \\ &= \varepsilon_i(t+i) \qquad [(11a)] \end{aligned}$$

$$\mathcal{E}\{u^2(t+i-1) \mid s(t)\} = \hat{u}_t^2(t+i-1) + \mathcal{E}\{\tilde{u}_t^2(t+i-1) \mid s(t)\}$$

$$\begin{aligned} \hat{u}_t(t+i-1) &:= \mathcal{E}\{u(t+i-1) \mid s(t)\} \\ &= \mu_i u(t) + \Lambda_i's(t) \qquad [(11b)] \end{aligned}$$

$$\begin{aligned} \tilde{u}_t(t+i-1) &:= u(t+i-1) - \hat{u}_t(t+i-1) \\ &= \nu_i(t+i-1) \qquad [(11b)] \end{aligned}$$

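Since $\mathcal{E}\{\tilde{y}_t^2(t+i) \mid s(t)\}$ and $\mathcal{E}\{\tilde{u}_t^2(t+i-1) \mid s(t)\}$ are unaffected by $u(t)$, we find
for the optimal input at time $t$

$$u(t) = F's(t) \tag{9.3-14a}$$

$$F = -\Xi^{-1}\sum_{i=1}^{T}\left(\theta_i\Gamma_i + \rho\,\mu_i\Lambda_i\right) \tag{9.3-14b}$$

$$\Xi := \sum_{i=1}^{T}\left(\theta_i^2 + \rho\,\mu_i^2\right) \tag{9.3-14c}$$

Note that, by virtue of (11c), (14) are well defined whenever $\rho > 0$.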
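The gain in (14) is the minimizer of a scalar quadratic in u(t), which makes it easy to check numerically. The following sketch assumes that the quantities θ_i, Γ_i, μ_i, Λ_i are already available as plain Python lists; the function names and all numerical values are hypothetical, and the factor 1/T of (13b) is omitted since it does not affect the minimizer.

```python
def tci_gain(theta, Gamma, mu, Lam, rho):
    """Feedback gain F and normalizer Xi as in (9.3-14b)-(9.3-14c), SISO case."""
    xi = sum(th * th + rho * m * m for th, m in zip(theta, mu))
    n_s = len(Gamma[0])
    F = [0.0] * n_s
    for th, G, m, L in zip(theta, Gamma, mu, Lam):
        for j in range(n_s):
            F[j] -= (th * G[j] + rho * m * L[j]) / xi
    return F, xi


def predicted_cost(u, s, theta, Gamma, mu, Lam, rho):
    """Sum over i of yhat_i^2 + rho * uhat_i^2, using the models (9.3-11)."""
    total = 0.0
    for th, G, m, L in zip(theta, Gamma, mu, Lam):
        y_hat = th * u + sum(g * x for g, x in zip(G, s))
        u_hat = m * u + sum(l * x for l, x in zip(L, s))
        total += y_hat ** 2 + rho * u_hat ** 2
    return total


# Invented data for T = 2 and a two-dimensional pseudostate.
theta = [1.0, 0.5]
Gamma = [[0.8, 0.0], [0.4, 0.2]]
mu = [1.0, -0.3]                     # mu_1 = 1 as required by (9.3-11c)
Lam = [[0.0, 0.0], [0.1, 0.0]]       # Lambda_1 = 0 as required by (9.3-11c)
rho = 0.5
F, xi = tci_gain(theta, Gamma, mu, Lam, rho)
s = [1.0, -2.0]
u_opt = sum(f * x for f, x in zip(F, s))
```

Evaluating `predicted_cost` at `u_opt` and at nearby inputs gives a quick confirmation that the gain indeed minimizes the predicted part of the cost.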



Remark 9.3-2 



• The reader should compare (14) with (5.7-28) and (5.7-29), and more generally
the problem we have tackled above with the Truncated Cost Iterations
(TCI) in Chapter 5, so as to verify that (14) represents the stochastic
counterpart of TCI.

• An important point not to be overlooked is that the feedback-gain vector $F$ in
(14) depends on the past and future regulation laws as specified by Theorem
1.

• Eqs. (14) are the SISO version of (5.7-28) and (5.7-29). We can conjecture
that (14) can be extended to cover the MIMO case. This is in fact true, as
shown in the next section. There we shall see that the MIMO extension of
(14) is, mutatis mutandis, formally the same as (5.7-28) and (5.7-29).

□ 



The above discussion suggests a way to construct an implicit adaptive regulator
whose underlying regulation law consists of TCI. We call such an adaptive regulator
MUSMAR (MUltiStep Multivariate Adaptive Regulator) by the acronym under
which it was first referred to in the literature [MM80].

MUSMAR (SISO Version)

Assume that the plant has been fed by inputs $u(k) = F'(k)s(k)$, $k \in \mathbb{Z}_+$, up to
time $t-1$. Let $u(t-1) = F'(t-1)s(t-1)$ with $F(t-1)$ based on estimates

$$\Theta_i(t-1) := [\,\Gamma_i'(t-1)\ \ \theta_i(t-1)\,]' \tag{9.3-15a}$$

$$M_i(t-1) := [\,\Lambda_i'(t-1)\ \ \mu_i(t-1)\,]' \tag{9.3-15b}$$

for $i = 1, 2, \ldots, T$ with $M_1(t-1) = [\,0_{n_s}'\ \ 1\,]'$.

i. Update the estimates via the RLS algorithm:

$$\Theta_i(t) = \Theta_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[y(t-T+i) - \varphi'(t-T)\,\Theta_i(t-1)\right] \tag{9.3-16a}$$

$$M_i(t) = M_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[u(t-T+i-1) - \varphi'(t-T)\,M_i(t-1)\right] \tag{9.3-16b}$$

$$\varphi(t-T) := [\,s'(t-T)\ \ u(t-T)\,]' \tag{9.3-16c}$$

$$P^{-1}(t-T+1) = P^{-1}(t-T) + \varphi(t-T)\,\varphi'(t-T) \tag{9.3-16d}$$

with $P^{-1}(1-T) = P^{-T}(1-T) > 0$.

ii. Compute the next input

$$u(t) = F'(t)s(t) \tag{9.3-17a}$$

with

$$F(t) = -\Xi^{-1}(t)\sum_{i=1}^{T}\left[\theta_i(t)\Gamma_i(t) + \rho\,\mu_i(t)\Lambda_i(t)\right] \tag{9.3-17b}$$

$$\Xi(t) = \sum_{i=1}^{T}\left[\theta_i^2(t) + \rho\,\mu_i^2(t)\right] \tag{9.3-17c}$$

iii. Cycle through i. and ii.
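The steps above can be sketched as a short simulation loop. This is a hedged illustration rather than a reference implementation: the first-order plant, the noise levels, the initial covariance, the small dither added to the input, and the safety clip on u(t) are all assumptions introduced here; only the RLS updates and the gain formula follow (15)-(17), with M_1 held fixed at [0' 1]'.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def musmar_siso(T=3, rho=0.1, steps=400, seed=0):
    """Run a SISO MUSMAR loop on a hypothetical first-order plant."""
    rng = random.Random(seed)
    n_s = 2                        # pseudostate s(t) = [y(t), u(t-1)]'
    n_phi = n_s + 1                # regressor phi = [s', u]'
    P = [[100.0 if i == j else 0.0 for j in range(n_phi)] for i in range(n_phi)]
    Theta = [[0.0] * n_phi for _ in range(T)]   # Theta_i = [Gamma_i', theta_i]'
    M = [[0.0] * n_phi for _ in range(T)]       # M_i = [Lambda_i', mu_i]'
    M[0][-1] = 1.0                              # M_1 = [0' 1]' is kept fixed
    y, u, s_hist = [0.0], [], []
    F = [0.0] * n_s
    for t in range(steps):
        s = [y[t], u[t - 1] if t > 0 else 0.0]
        s_hist.append(s)
        if t >= T:
            # Step i: RLS updates sharing the regressor phi(t - T) and one gain.
            phi = s_hist[t - T] + [u[t - T]]
            P_phi = [dot(row, phi) for row in P]
            denom = 1.0 + dot(phi, P_phi)
            gain = [v / denom for v in P_phi]   # P(t-T+1) * phi(t-T)
            for i in range(n_phi):
                for j in range(n_phi):
                    P[i][j] -= gain[i] * P_phi[j]
            for i in range(1, T + 1):
                err_y = y[t - T + i] - dot(phi, Theta[i - 1])
                for j in range(n_phi):
                    Theta[i - 1][j] += gain[j] * err_y
                if i >= 2:
                    err_u = u[t - T + i - 1] - dot(phi, M[i - 1])
                    for j in range(n_phi):
                        M[i - 1][j] += gain[j] * err_u
        # Step ii: feedback gain from the current estimates.
        xi = sum(Th[-1] ** 2 + rho * Mi[-1] ** 2 for Th, Mi in zip(Theta, M))
        F = [-sum(Th[-1] * Th[j] + rho * Mi[-1] * Mi[j]
                  for Th, Mi in zip(Theta, M)) / xi for j in range(n_s)]
        u_t = dot(F, s) + 0.05 * rng.gauss(0.0, 1.0)   # small dither (assumption)
        u_t = max(-10.0, min(10.0, u_t))               # safety clip (assumption)
        u.append(u_t)
        # Hypothetical plant: y(t+1) = 0.9 y(t) + u(t) + e(t+1).
        y.append(0.9 * y[t] + u_t + 0.1 * rng.gauss(0.0, 1.0))
    return y, u, F, M


y, u, F, M = musmar_siso()
```

Because the last entry of M_1 is fixed at one, the normalizer in step ii is never smaller than rho, so the gain stays well defined from the very first step, before any estimate has been updated.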

Fig. 3 shows the time steps corresponding to the regressor $\varphi(t-T)$ and the regressands
$y(t-T+i)$ and $u(t-T+i-1)$, when the input to be computed is $u(t)$.
Fig. 4 depicts the signal flows in the RLS identifiers of MUSMAR when $T = 3$.

Figure 9.3-3: Time steps for the regressor and regressands when the next input
to be computed is u(t).

Figure 9.3-4: Signal flows in the bank of parallel MUSMAR RLS identifiers when
T = 3 and the next input is u(t).

Remark 9.3-3

• As shown in Fig. 4, the MUSMAR identifiers are made up of $T$ separate RLS
estimators all using a common regressor and hence a common updating gain.
This feature moderates the overall computational burden. Notice that, in
contrast with the scheme of Fig. 1, MUSMAR identifiers have no interlacing.

• Notice that we estimate the $M_i(t)$. These, however, could alternatively be
computed from $\Theta_j(t)$ and $F(t-T+j)$, $j = 1, \ldots, i-1$. Though it looks
hazardous, the former alternative is suggested for the related positive working
experience and its lower computational burden.

• Besides RLS estimation of the parameters of suitable implicit prediction
models of linear-regression type, MUSMAR performs on-line spread-in-time
Truncated Cost Iterations.

□

Main points of the section The implicit multistep prediction models of linear 
regression type can be exploited to single out implicit adaptive multistep predic- 
tive controllers based on SIORHC, GPC, or, by considering additional constraints 
to simplify the models, MUSMAR, the latter performing on-line spread-in-time 
Truncated Cost Iterations. In contrast with the explicit schemes wherein the open 
loop plant CARMA model is identified, the implicit adaptive multistep predictive 
controllers perform the separate identification of the parameters of closed-loop 
multistep-ahead prediction models of linear-regression type. 

9.4 MUSMAR as an Adaptive Reduced-Complexity Controller

In this section we derive the MUSMAR algorithm for MIMO CARMA plants fol- 
lowing a quite different viewpoint from the one of the previous section which is 
based on implicit multistep prediction models. The reason for doing this is to show 
that MUSMAR can be looked at as an adaptive reduced-complexity controller 
requiring little prior information on the plant. 

A Delayed RHR Problem 

We consider hereafter the following regulation problem. A plant with inputs $u(t) \in \mathbb{R}^m$
and outputs $y(t) \in \mathbb{R}^p$ is to be regulated. It is known that $y(t)$ and $u(t)$ are
at least locally related by the CARMA system

$$A(d)\,y(t) = B(d)\,u(t) + C(d)\,e(t) \tag{9.4-1}$$

In (1): $e$ is a $p$-vector-valued wide-sense stationary zero-mean white innovations
sequence with positive-definite covariance; $A(d)$, $B(d)$, $C(d)$ are polynomial matrices;
$A(d)$ has dimension $p \times p$ and all other matrices have compatible dimensions.
Further, $B(0) = O_{p \times m}$, viz. the plant exhibits I/O delays at least equal to one.

We assume that:

• $A^{-1}(d)\,[\,B(d)\ \ C(d)\,]$ is an irreducible left MFD; (9.4-2a)

• $C(d)$ is strictly Hurwitz; (9.4-2b)

• the gcld's of $A(d)$ and $B(d)$ are strictly Hurwitz. (9.4-2c)



We point out that, in view of Theorem 6.4-2, (2b) entails no limitation, and (2c) is 
a necessary condition for the existence of a linear compensator, acting on the ma- 
nipulated input u only, capable of making the resulting feedback system internally 
stable. 

Though it is known that the plant is representable as in (1), no, or only incomplete,
information is available on the entries of the above polynomial matrices.
Then, the structure, degrees and coefficients of their polynomial entries, and, hence,
the associated I/O delays are either unknown or only partially a priori given. Despite
our ignorance of the plant, we assume that it is a priori known that there exist
feedback-gain matrices $F$ such that the plant can be stabilized by the regulation
law

$$u(t) = F's(t) \tag{9.4-3}$$

where

$$s(t) := \left[\,(y^{t}_{t-n_y})' \ \ (u^{t-1}_{t-n_u})'\,\right]' \in \mathbb{R}^{n_s}, \qquad n_s := n_u + n_y + 1 \tag{9.4-4}$$

Hereafter $s(t)$ will be referred to as the pseudostate or regulator-regressor of
complexity $(n_y, n_u)$. A priori knowledge of a suitable regulator-regressor complexity
can be inferred from the physical characteristics of the plant. This happens to be
frequently true in applications.

In the SISO case, we may know that, in the useful frequency band, (1) is an
accurate enough description of the plant provided that

$$A(d) = 1 + a_1 d + \cdots + a_{\partial A}\,d^{\partial A} \qquad (a_{\partial A} \ne 0) \tag{9.4-5}$$

$$B(d) = d^{\ell}\,\tilde{B}(d) = d^{\ell}\left[b_1 d + \cdots + b_{\partial B}\,d^{\partial B}\right] \qquad (b_1 \ne 0,\ b_{\partial B} \ne 0) \tag{9.4-6}$$

with $\partial A$ and $\partial B$ given and the I/O transport delay $\ell$, $\ell = 0, 1, \ldots$, unknown,
possibly time-varying, and such that

$$0 \le \ell \le \ell_M \tag{9.4-7}$$

$\ell_M$ being the largest possible I/O delay. In such a case, if $\partial C$ denotes the degree
of $C(d)$, the regulator-regressor (4) corresponding to steady-state LQS regulation
of (1) fulfills the following prescriptions (Cf. Problem 7.3-16)

$$n_y = \max\{\partial A - 1,\ \partial C - \ell - 1\} \tag{9.4-8}$$

$$n_u = \max\{\partial B + \ell - 1,\ \partial C\} \tag{9.4-9}$$

which, in turn, should $\ell$ be in the uncertainty range (7), safely become

$$n_y = \max\{\partial A, \partial C\} - 1 \tag{9.4-10}$$

$$n_u = \max\{\partial B + \ell_M - 1,\ \partial C\} \tag{9.4-11}$$

It is worth saying that in practice $n_y$ and $n_u$ seldom follow the prescriptions above
but, more often, reflect a compromise between the complexity of the adaptive
regulator and the ideally achievable performance of the regulated system.
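The prescriptions (8)-(11) amount to a few integer comparisons, as the following sketch shows. The function and argument names are hypothetical; the example degrees and delay bound are invented purely to exercise the formulas.

```python
def complexity_known_delay(deg_A, deg_B, deg_C, delay):
    """(n_y, n_u) from (9.4-8)-(9.4-9) for a known I/O transport delay."""
    n_y = max(deg_A - 1, deg_C - delay - 1)
    n_u = max(deg_B + delay - 1, deg_C)
    return n_y, n_u


def complexity_safe(deg_A, deg_B, deg_C, max_delay):
    """(n_y, n_u) from (9.4-10)-(9.4-11), valid for any delay in [0, max_delay]."""
    n_y = max(deg_A, deg_C) - 1
    n_u = max(deg_B + max_delay - 1, deg_C)
    return n_y, n_u


# Invented example: deg A = 2, deg B = 1, deg C = 1, delay known to be at most 3.
n_y, n_u = complexity_safe(2, 1, 1, 3)
n_s = n_u + n_y + 1                  # pseudostate dimension as in (9.4-4)
```

For every admissible delay the safe values are at least as large as the delay-specific ones, which is what makes (10)-(11) a conservative choice.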

The problem is how to develop a self-tuning algorithm capable of selecting a
satisfactory feedback-gain matrix $F$. To do this, we only stipulate that, whatever
plant structure and dimensionality might be, an $(n_y, n_u)$ pseudostate complexity
is adequate. Let the performance index be

$$C_T(t) = \mathcal{E}\{J_T(t) \mid s\} \tag{9.4-12a}$$

$$J_T(t) = \frac{1}{T}\sum_{k=t-T}^{t-1}\left[\|y(k+1)\|^2_{\psi_y} + \|u(k)\|^2_{\psi_u}\right] \tag{9.4-12b}$$

$$s := s(t-T) \tag{9.4-12c}$$

Assume that, except for the first, all inputs in (12b) are given by

$$u(k) = F'(k)s(k), \qquad t-T < k \le t-1 \tag{9.4-13}$$

with known feedback-gain matrices $F(k)$. Then the next feedback-gain matrix $F(t)$
is found as follows. Consider the performance index (12) subject to the constraints
(13). As usual, let $[s]$ be the subspace of all random vectors whose components are
linear combinations of the components of $s$. Find

$$u^\circ := u^\circ(t-T) \in [s] \qquad\text{or}\qquad u^\circ = F^{\circ\prime}s \tag{9.4-14}$$

such that

$$u^\circ = \arg\min_{u \in [s]} C_T(t) \tag{9.4-15}$$

Then, set

$$u(t) = F^{\circ\prime}s(t) \tag{9.4-16}$$

and move to $C_T(t+1)$ so as to compute $u(t+1)$.

The rule (12)-(16) is reminiscent of a Receding Horizon Regulation
(RHR) scheme in a stochastic setting. A RHR rule would select the input $u(t)$
at time $t$ which, together with the subsequent input sequence $u^\circ_{[t+1,t+T)}$, minimizes
some index and fulfills possible constraints. Then the input $u(t)$ would
be applied and $u^\circ_{[t+1,t+T)}$ discarded. A similar operation is repeated with $t$ replaced
by $t+1$ in order to find $u(t+1)$. The adoption of a strict RHR procedure requires
the explicit use, in the related optimization stage, of a plant description such as (1).
Being prevented from this by our ignorance, we are forced to adopt the delayed scheme
(12)-(16). There we find $u(t)$ at time $t$ by considering the plant behaviour over the
"past" interval $[t-T, t)$. As a reminder of the above crucial differences between
strict RHR and the adopted scheme (12)-(16), the latter will be referred to as a
delayed RHR scheme with prediction horizon $T$.



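The Algorithm

In order to solve (15), set

$$C_T(t) = \hat{C}_T(t) + \tilde{C}_T(t) \tag{9.4-17a}$$

$$\hat{C}_T(t) := \frac{1}{T}\sum_{k=t-T}^{t-1}\mathcal{E}\left\{\|\hat{y}(k+1)\|^2_{\psi_y} + \|\hat{u}(k)\|^2_{\psi_u} \mid s\right\} \tag{9.4-17b}$$

$$\tilde{C}_T(t) := \frac{1}{T}\sum_{k=t-T}^{t-1}\left[\mathrm{Tr}\,\psi_y\,\mathcal{E}\{\tilde{y}(k+1)\tilde{y}'(k+1) \mid s\} + \mathrm{Tr}\,\psi_u\,\mathcal{E}\{\tilde{u}(k)\tilde{u}'(k) \mid s\}\right] \tag{9.4-17c}$$

where Tr denotes trace and $\hat{y}(k+1)$ and $\hat{u}(k)$ are the orthogonal projections of
$y(k+1)$ and $u(k)$ onto $[u, s]$, the linear subspace of random vectors spanned by
$\{u, s\}$,

$$\hat{y}(k+1) := \mathrm{Projec}\left[\,y(k+1) \mid [u, s]\,\right], \qquad \tilde{y}(k+1) := y(k+1) - \hat{y}(k+1) \tag{9.4-17d}$$

and

$$\hat{u}(k) := \mathrm{Projec}\left[\,u(k) \mid [u, s]\,\right], \qquad \tilde{u}(k) := u(k) - \hat{u}(k) \tag{9.4-17e}$$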

Note that ideally $[u, s] = [s]$ because $u \in [s]$. In practice, since we are not going to
perform the optimization analytically but on-line using real data, we cannot rule
out the possibility that the actual $u(t-T)$ does not belong to $[s]$. This indeed
happens whenever the inputs are of the form

$$u(k) = F'(k)s(k) + \eta(k) \tag{9.4-18}$$



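where $\eta(k)$ is either an undesirable disturbance or an intentional dither. In this
respect, we shall assume that $\eta$ is a wide-sense zero-mean white noise, uncorrelated
with $e$. It is to be pointed out that, even if the constraints (13) become in practice
(18) for $k \in [t-T+1, t)$, $y^{t}_{t-T+1}$ and $u^{t-1}_{t-T+1}$ depend linearly on $u$. Hence
$\tilde{y}^{t}_{t-T+1}$ and $\tilde{u}^{t-1}_{t-T+1}$ are unaffected by $u$. Further note that

$$\hat{y}(k) = \mathcal{E}\{y(k)\varphi'\}\,\mathcal{E}^{-1}\{\varphi\varphi'\}\,\varphi \qquad \hat{u}(k) = \mathcal{E}\{u(k)\varphi'\}\,\mathcal{E}^{-1}\{\varphi\varphi'\}\,\varphi \qquad \varphi := [\,s'\ \ u'\,]' \tag{9.4-19}$$

In conclusion, our problem (15) simplifies as follows

$$u^\circ := u^\circ(t-T) = \arg\min_{u \in [s]} J \tag{9.4-20a}$$

$$J := \sum_{k=t-T}^{t-1}\left[\|\hat{y}(k+1)\|^2_{\psi_y} + \|\hat{u}(k)\|^2_{\psi_u}\right] \tag{9.4-20b}$$

Let for $i = 1, 2, \ldots, T$

$$\hat{y}(t-T+i) = \Theta_i'\varphi = \theta_i u + \Gamma_i's \tag{9.4-21a}$$

where

$$\Theta_i' := \mathcal{E}\{y(t-T+i)\varphi'\}\,\mathcal{E}^{-1}\{\varphi\varphi'\} = [\,\Gamma_i'\ \ \theta_i\,] \tag{9.4-21b}$$

Similarly,

$$\hat{u}(t-T+i-1) = M_i'\varphi = \mu_i u + \Lambda_i's \tag{9.4-22a}$$

where

$$M_i' := \mathcal{E}\{u(t-T+i-1)\varphi'\}\,\mathcal{E}^{-1}\{\varphi\varphi'\} = [\,\Lambda_i'\ \ \mu_i\,] \tag{9.4-22b}$$

Obviously, in (22a)

$$\mu_1 = I_m \qquad\text{and}\qquad \Lambda_1 = O_{n_s \times m} \tag{9.4-22c}$$

Using (21) and (22) in (20), we find

$$u^\circ = F^{\circ\prime}s \tag{9.4-23a}$$

$$F^{\circ\prime} = -\Xi^{-1}\sum_{i=1}^{T}\left(\theta_i'\psi_y\Gamma_i' + \mu_i'\psi_u\Lambda_i'\right) \tag{9.4-23b}$$

$$\Xi := \sum_{i=1}^{T}\left(\theta_i'\psi_y\theta_i + \mu_i'\psi_u\mu_i\right) \tag{9.4-23c}$$

Note that by (22c), whenever $\psi_u > 0$, $\Xi = \Xi' > 0$, and hence (20) is uniquely
solved by (23).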
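The step from (20)-(22) to (23) can be spelled out as a stationarity condition. The following is a sketch under the assumptions that $\psi_y$ and $\psi_u$ are symmetric weighting matrices and that the minimization over $u$ is carried out for a fixed $s$; the intermediate symbol $J(u)$ is introduced here only for illustration.

```latex
\begin{aligned}
J(u) &= \sum_{i=1}^{T}\left[\,\|\theta_i u + \Gamma_i' s\|^2_{\psi_y}
        + \|\mu_i u + \Lambda_i' s\|^2_{\psi_u}\right] \\[4pt]
\tfrac{1}{2}\,\frac{\partial J}{\partial u}
     &= \sum_{i=1}^{T}\left[\theta_i'\psi_y(\theta_i u + \Gamma_i' s)
        + \mu_i'\psi_u(\mu_i u + \Lambda_i' s)\right] \\
     &= \Xi\,u + \sum_{i=1}^{T}\left(\theta_i'\psi_y\Gamma_i'
        + \mu_i'\psi_u\Lambda_i'\right)s = 0, \qquad
        \Xi := \sum_{i=1}^{T}\left(\theta_i'\psi_y\theta_i + \mu_i'\psi_u\mu_i\right) \\[4pt]
\Xi &\ge \mu_1'\psi_u\mu_1 = \psi_u > 0
        \quad\Longrightarrow\quad
        u^\circ = -\,\Xi^{-1}\sum_{i=1}^{T}\left(\theta_i'\psi_y\Gamma_i'
        + \mu_i'\psi_u\Lambda_i'\right)s
\end{aligned}
```

The lower bound on $\Xi$ uses $\mu_1 = I_m$ from (22c), which is why positive definiteness of $\psi_u$ alone guarantees a unique minimizer.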

In order to compute Qi and Mi we use Proposition 6.3-1 on the relationship 
between RLS updates and Normal Equations as follows. Let Qi(t), i ~ 1, 2, • • • , T, 
t = T,T +!,■■■, and dim6j(£) = dim ip, be given by the RLS updates 



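$$\begin{aligned}
\Theta_i(t) &= \Theta_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[\, y'(t-T+i) - \varphi'(t-T)\,\Theta_i(t-1) \,\right] \\
P^{-1}(t-T+1) &= P^{-1}(t-T) + \varphi(t-T)\varphi'(t-T), \qquad P(0) > 0
\end{aligned} \tag{9.4-24}$$

Then $\Theta_i(t)$ satisfies the Normal Equations

$$\left[\sum_{k=0}^{t-T} \varphi(k)\varphi'(k)\right]\Theta_i(t) = \sum_{k=0}^{t-T} \varphi(k)\,y'(k+i) + P^{-1}(0)\left[\,\Theta_i(T-1) - \Theta_i(t)\,\right] \tag{9.4-25}$$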

The following result then follows directly from Theorem 6.4-2. 

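Proposition 9.4-1. Let the I/O joint process $\{u(k-1), y(k)\}$ be strictly stationary
and ergodic with bounded $\mathcal{E}\{\varphi\varphi'\} > 0$. Let $\Theta_i(t)$ be given by the RLS updates
(24). Then,

$$\lim_{t \to \infty} \Theta_i(t) = \mathcal{E}^{-1}\{\varphi\varphi'\}\,\mathcal{E}\{\varphi\, y'(t-T+i)\} \quad \text{a.s.} \tag{9.4-26}$$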
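The link between the RLS updates (24) and the Normal Equations (25) can be checked numerically. The following is a minimal sketch, not the book's code: the data, dimensions and the names `rls` and `normal_equations` are illustrative assumptions, and the time indexing is simplified to a plain sequence of regressor/output pairs.

```python
import numpy as np

def rls(phis, ys, theta0, P0):
    """Recursive least squares in the form of (24):
    P^{-1} <- P^{-1} + phi phi',  Theta <- Theta + P phi (y' - phi' Theta)."""
    theta = theta0.copy()
    P_inv = np.linalg.inv(P0)
    for phi, y in zip(phis, ys):
        P_inv = P_inv + np.outer(phi, phi)
        gain = np.linalg.solve(P_inv, phi)            # updated P times phi
        theta = theta + np.outer(gain, y - phi @ theta)
    return theta

def normal_equations(phis, ys, theta0, P0):
    """Batch solution of [P0^{-1} + sum phi phi'] Theta = sum phi y' + P0^{-1} Theta0,
    which is (25) with the P^{-1}(0) term moved to the left-hand side."""
    P0_inv = np.linalg.inv(P0)
    lhs = P0_inv + phis.T @ phis
    rhs = phis.T @ ys + P0_inv @ theta0
    return np.linalg.solve(lhs, rhs)

# Hypothetical data: 3-dimensional regressor, 2 outputs (assumed values).
rng = np.random.default_rng(0)
phis = rng.standard_normal((200, 3))
theta_true = np.array([[1.0, -0.5], [0.3, 0.8], [-0.7, 0.2]])
ys = phis @ theta_true + 0.1 * rng.standard_normal((200, 2))
theta0, P0 = np.zeros((3, 2)), 10.0 * np.eye(3)

theta_rls = rls(phis, ys, theta0, P0)
theta_ne = normal_equations(phis, ys, theta0, P0)
```

The recursive and batch estimates coincide up to rounding, which is why the limit (26) can be read off from the sample covariances in the Normal Equations.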

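Similarly, let $M_i(t)$ be given by the following RLS updates for $i = 2, 3, \dots, T$

$$\left\{\begin{aligned}
M_i(t) &= M_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[\, u'(t-T+i-1) - \varphi'(t-T)\,M_i(t-1) \,\right] \\
M_1'(t) &= M_1'(t-1) = [\, I_m \;\; O_{m \times n_s} \,]
\end{aligned}\right. \tag{9.4-27}$$

Then

$$\lim_{t \to \infty} M_i(t) = \mathcal{E}^{-1}\{\varphi\varphi'\}\,\mathcal{E}\{\varphi\, u'(t-T+i-1)\} \quad \text{a.s.} \tag{9.4-28}$$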



Putting together the above results, we arrive at a recursive regulation algorithm,
a candidate for solving on-line the delayed RHR problem (12)-(16).

Theorem 9.4-1 (MUSMAR). Consider the delayed RHR problem
(12)-(16) for the multivariable CARMA plant (1) having unknown structure
(state-dimension, I/O deadtimes, etc.) and parameters. Then, the following recursive
algorithm for $t = T, T+1, \dots$

$$P^{-1}(t-T+1) = P^{-1}(t-T) + \varphi(t-T)\varphi'(t-T) \tag{9.4-29a}$$

$$\Theta_i(t) = \Theta_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[\, y'(t-T+i) - \varphi'(t-T)\,\Theta_i(t-1) \,\right] \tag{9.4-29b}$$

$$M_i(t) = M_i(t-1) + P(t-T+1)\,\varphi(t-T)\left[\, u'(t-T+i-1) - \varphi'(t-T)\,M_i(t-1) \,\right] \tag{9.4-29c}$$



e i= [ r t (t) ]' M i= [ A'S) W (t) ]' 

T 

F\t) = -E-\t) j2 mWiit) + m^uKm 



i=i 

T 



u(t) = F'(t)s(t) 



(9.4-29d) 
(9.4-29e) 

(9.4-29f) 
(9.4-29g) 



with $P(0) > 0$ and $\Lambda_1(t) = O_{n_s \times m}$, $\mu_1 = I_m$, solves as $t \to \infty$ the stated RHR
problem, whenever the joint process $\varphi$ becomes strictly stationary and ergodic, with
bounded $\mathcal{E}\{\varphi\varphi'\} > 0$.

Theorem 1 justifies the use of the recursive algorithm (29) to solve on-line the 
delayed RHR problem (12)-(16) for any unknown plant. However, it does not 
tell whether the algorithm — assuming that it converges — yields a satisfactory 
closed-loop system. This issue will be studied in depth in the next section. 



2-DOF MUSMAR 

Our interest is to modify the pure regulation algorithm (29) so as to make the plant
output capable of tracking a reference $r(t) \in \mathbb{R}^p$. In this connection the following
modifications can be made to the algorithm (29). They can be justified mutatis
mutandis by the same arguments as in the proof of Theorem 1.

In a tracking problem the variable to be regulated to zero is the tracking error

$$\varepsilon_y(k) := y(k) - r(k) \tag{9.4-30}$$

Two alternatives are considered for the choice of $s(k)$. In the first

$$s(k) := [\, \varepsilon_y(k-n_y) \;\cdots\; \varepsilon_y(k) \;\; u(k-n_u) \;\cdots\; u(k-1) \,]' \tag{9.4-31}$$

and, accordingly, MUSMAR acts as a one degree-of-freedom (1-DOF) controller. 
In the second 

«(*) : = [ (y k r v ) («&)' (^)'f ( 9 - 4 - 32 ) 

and, accordingly, MUSMAR acts as a two degree-of-freedom (2-DOF) controller. 
The choice of the controller-regressor s(k) in (32) is justified by resorting to the 
controller structure in the steady-state LQS servo problem (Cf. Theorem 7.5-1 and 
Proposition 7.5-1). 

The 2-DOF version of MUSMAR has typically a better tracking performance
than the 1-DOF version.
Example 9.4-1 [GGMP92] (MUSMAR autotuning of PID controllers for a two-link robot) We
consider a double-input double-output plant consisting of the mathematical model of a
two-link robot manipulator in a vertical plane. The model is a continuous-time nonlinear dynamic
system and refers to the second and third link of the Unimation PUMA 560 controlled via the
two corresponding joint-actuators. The control laws considered hereafter pertain to a sampling
time of $10^{-2}$ seconds. Although the plant is nonlinear, we use as controllers two MUSMAR
algorithms acting individually on each joint in a fully decentralized fashion. Specifically, for the
i-th joint, i = 1, 2, the corresponding MUSMAR algorithm selects the three-component
feedback-gain vector,

$$F_i(t) := [\, f_{i1}(t) \;\; f_{i2}(t) \;\; f_{i3}(t) \,]' \tag{9.4-33a}$$

in the control law (33b), with pseudostate (33c) and tracking error

$$\varepsilon_{y_i}(t) := r_i(t) - y_i(t) \tag{9.4-33d}$$

$r_i(t)$ being the reference for the i-th joint. $F_i(t)$ is selected so as to minimize, as indicated after
(12a), the following performance index with a 10-step prediction horizon

$$C_{10}(t) = \mathcal{E}\{J_{10}(t) \mid s_i(t-10)\} \tag{9.4-34a}$$
J io(*) = i7i E [e 2 y M + e-W- & Su^k)] (9.4,34b) 

iU fc=i_10 

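The control law and pseudostate in (33) are

$$\delta u_i(t) := u_i(t) - u_i(t-1) = F_i'(t)s_i(t) \tag{9.4-33b}$$

$$s_i(t) := [\, \varepsilon_{y_i}(t) \;\; \varepsilon_{y_i}(t-1) \;\; \varepsilon_{y_i}(t-2) \,]' \tag{9.4-33c}$$

Omitting the argument t in the feedback gain, (33) can also be rewritten as follows

$$u_i(t) = \left[ K_{Pi} + K_{Ii}\,\frac{T_s}{2}\,\frac{1+d}{1-d} + K_{Di}\,\frac{1}{T_s}\,(1-d) \right] \varepsilon_{y_i}(t) \tag{9.4-35a}$$

with $T_s = 10^{-2}$ seconds and

$$2K_{Pi} = f_{i1} - f_{i2} - 3f_{i3}, \qquad T_s K_{Ii} = f_{i1} + f_{i2} + f_{i3}, \qquad K_{Di} = T_s f_{i3} \tag{9.4-35b}$$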

The control law (35a) is a digital version of the classical PID controller obtained by using the Tustin
approximation for the integral term and backward difference for the derivative term [AW84]. Thus,
MUSMAR in the configuration (33) and (34) can be used to adaptively autotune the two digital
PID controllers of the robot manipulator. To this end, the reference trajectories for the two
joints are chosen to be periodic so as to represent repetitive tasks for the robot manipulator. For
each joint a smooth trapezoidal reference trajectory is used (Fig. 1). A payload of 7 kilograms
is considered to be picked up at the lower rest position and released at the upper rest position
of the terminal link. Fig. 2a and b show the time-evolution of the three MUSMAR
feedback components, reexpressed as $K_{Pi}$, $K_{Ii}$ and $K_{Di}$ via (35b), for the two joints over a 200
seconds run. The feedback-gains on which MUSMAR self-tunes are used in a constant feedback
controller, one for each joint. If these two resulting fixed decentralized controllers are used to
control the manipulator, we obtain the tracking-error behaviour indicated by the solid lines in
Fig. 3a and b for the two joints. In the same figures the dotted lines indicate the tracking-error
behaviour obtained by two digital decentralized PID controllers whose gains are selected via the
classical Ziegler and Nichols trial-and-error tuning method [AW84]. Note that the MUSMAR
autotuned feedback-gains yield a definitely better tracking performance. We point out that, since
the optimal feedback-gains for a restricted-complexity controller are dependent on the selected
trajectories, it is usually required to repeat the MUSMAR autotuning procedure when the robot
task is changed. A final remark concerns the possible use in the two controllers of the common
pseudostate

$$s(t) = [\, s_1'(t) \;\; s_2'(t) \,]'$$

$s_1(t)$ and $s_2(t)$ being defined as above. In this way the controller on each joint has some
information on the current state of the other link. This generally leads to a further improvement of the
tracking performance.
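The correspondence (35b) between the three MUSMAR feedback components and the PID gains can be illustrated with a short sketch. The function names and the numerical gains below are hypothetical; only the algebra follows from multiplying the PID law (35a) by $(1-d)$ and matching the coefficients of $\varepsilon_{y_i}(t)$, $\varepsilon_{y_i}(t-1)$, $\varepsilon_{y_i}(t-2)$.

```python
import numpy as np

def gains_to_pid(f, Ts):
    """Map the incremental gains f = (f1, f2, f3) to (Kp, Ki, Kd) as in (35b)."""
    f1, f2, f3 = f
    return 0.5 * (f1 - f2 - 3.0 * f3), (f1 + f2 + f3) / Ts, Ts * f3

def pid_to_gains(Kp, Ki, Kd, Ts):
    """Inverse map, from the coefficients of (1 - d) times the PID operator in (35a)."""
    return np.array([Kp + 0.5 * Ki * Ts + Kd / Ts,
                     -Kp + 0.5 * Ki * Ts - 2.0 * Kd / Ts,
                     Kd / Ts])

def incremental_control(f, errors):
    """u(t) = u(t-1) + f1 e(t) + f2 e(t-1) + f3 e(t-2), zero initial conditions."""
    u, e1, e2, out = 0.0, 0.0, 0.0, []
    for e in errors:
        u = u + f[0] * e + f[1] * e1 + f[2] * e2
        out.append(u)
        e1, e2 = e, e1
    return np.array(out)

def positional_pid(Kp, Ki, Kd, Ts, errors):
    """PID with Tustin (trapezoidal) integral and backward-difference derivative."""
    integ, e1, out = 0.0, 0.0, []
    for e in errors:
        integ += 0.5 * Ts * (e + e1)
        out.append(Kp * e + Ki * integ + Kd * (e - e1) / Ts)
        e1 = e
    return np.array(out)

Ts = 0.01                                  # sampling time used in the example
Kp, Ki, Kd = 2.0, 5.0, 0.1                 # assumed gains, for illustration only
f = pid_to_gains(Kp, Ki, Kd, Ts)
errors = np.sin(0.05 * np.arange(50))      # arbitrary tracking-error sequence
u_inc = incremental_control(f, errors)
u_pid = positional_pid(Kp, Ki, Kd, Ts, errors)
```

Both recursions produce the same control sequence, so tuning the three components of $F_i$ is equivalent to tuning the three PID gains.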



Figure 9.4-2a: Time evolution of the three PID feedback-gains $K_P$, $K_I$, and $K_D$
adaptively obtained by MUSMAR for the joint 1 of the robot manipulator.

Figure 9.4-2b: Time evolution of the three PID feedback-gains $K_P$, $K_I$, and $K_D$
adaptively obtained by MUSMAR for the joint 2 of the robot manipulator.

Figure 9.4-3: Time evolution of the tracking errors for the robot manipulator
controlled by a digital PID autotuned by MUSMAR (solid lines) or Ziegler and
Nichols method (dotted lines): (a) joint 1 error; (b) joint 2 error.

Main points of the section While in the previous section the MUSMAR algorithm
was obtained as an implicit TCI-based adaptive regulator using the knowledge
of the CARMA plant structure (degrees of the CARMA polynomials), in this
section it is shown that the same algorithm is a candidate for adaptively tuning
reduced-complexity control laws for highly uncertain plants.



9.5 MUSMAR Local Convergence Properties 



For the implicit adaptive predictive control algorithms introduced in the previous 
two sections convergence analysis turns out to be a difficult task. One of the
reasons is that the parameters that the controller attempts to identify do not solely
depend on the plant but also, and in a complicated way, on the past feedback
and feedforward gains. In particular, the estimated parameters in the MUSMAR
algorithm are in a sense more directly related to the controller than to the plant,
particularly if the former is of reduced-complexity relative to the latter.

Under these circumstances, implicit adaptive predictive control algorithms do 
not appear amenable to global convergence analysis. Further, we may be legiti- 
mately concerned about their possible lack of global convergence properties for two 
main reasons. First, their underlying control laws need not be stabilizing unless 
some provisions are taken. E.g., in Chapter 5 some guidelines were given on the 
prediction-horizon length in order to make a TCI-based controller stabilizing. Sec- 
ond, the on-line control synthesis is based on parameters which, in turn, depend 
on the past controller-gains. Hence, if at a given time the latter make the closed- 
loop system unstable and regressors of very high magnitude are experienced, the 
identifier gains can quickly go to zero and the subsequent estimates corresponding 
to a nonstabilizing set of controller-gains will stay constant. Since it cannot be 
ruled out that such estimates produce a nonstabilizing new set of controller-gains, 
saturations may occur such as to prevent subsequent use of the adaptive algorithm. 
This consideration leads us to conjecture that the implicit predictive controllers of
the last two sections, based on the use of standard RLS identifiers, could be im- 
proved by equipping their identifiers with constant-trace and data normalization 
features. Such a conjecture is in fact confirmed by experimental evidence. 

Though global convergence analysis is a very difficult task, local convergence analysis
of implicit adaptive predictive controllers, e.g. MUSMAR, can be carried out via
stochastic averaging methods, such as the Ordinary Differential Equation approach.
Such an analysis is important in that it can reveal the possible convergence points
even when plant neglected dynamics are present. E.g., the derivation of the last section
proposes MUSMAR as a candidate reduced-complexity adaptive controller. Hereafter, by
using the ODE method we shall uncover how MUSMAR can behave in the presence
of neglected dynamics.

9.5.1 Stochastic Averaging: the ODE Method 

The updating formula of many recursive algorithms has typically the form:

$$\theta(t) = \theta(t-1) + \gamma(t)\,\mathcal{R}^{-1}(t)\,\varphi(t-1)\,\varepsilon(t) \tag{9.5-1a}$$

$$\mathcal{R}(t) = \mathcal{R}(t-1) + \gamma(t)\left[\,\varphi(t-1)\varphi'(t-1) - \mathcal{R}(t-1)\,\right] \tag{9.5-1b}$$

where: $\theta(t)$ denotes the estimate at time $t$; $\{\gamma(t)\}$ a scalar-valued gain sequence;
$\varphi(t-1)$ a regression vector; $\varepsilon(t)$ a prediction error. E.g., the RLS algorithm (6.3-13)
can be rewritten as in (1) if we set

$$\mathcal{R}(t) := \frac{1}{t}\,P^{-1}(t), \qquad \mathcal{R}(0) := P^{-1}(0) \tag{9.5-2a}$$

and

$$\gamma(t) := \frac{1}{t} \tag{9.5-2b}$$
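The equivalence between the usual covariance form of RLS and the normalized form (1) with the choices (2) can be checked with a small numerical sketch. Everything below (dimensions, data, the starting time `t0` at which the normalized recursion is initialized from $P^{-1}(t_0)/t_0$) is an illustrative assumption rather than part of the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, t0 = 2, 60, 5
theta_true = np.array([0.7, -0.3])            # hypothetical parameter vector
phis = rng.standard_normal((N, n))            # phis[t-1] plays the role of phi(t-1)
ys = phis @ theta_true + 0.05 * rng.standard_normal(N)

theta_p = np.zeros(n)                         # estimate in covariance (P) form
P_inv = np.eye(n) / 100.0                     # P^{-1}(0)
theta_g, R = None, None                       # estimate and R(t) in the form (1)

for t in range(1, N + 1):
    phi, y = phis[t - 1], ys[t - 1]
    # Covariance form: P^{-1}(t) = P^{-1}(t-1) + phi phi', theta += P(t) phi eps
    eps = y - phi @ theta_p
    P_inv = P_inv + np.outer(phi, phi)
    theta_p = theta_p + np.linalg.solve(P_inv, phi) * eps
    if t == t0:
        # Initialize the normalized recursion with R(t0) = P^{-1}(t0) / t0
        theta_g, R = theta_p.copy(), P_inv / t0
    elif t > t0:
        gamma = 1.0 / t                       # gain sequence (2b)
        eps_g = y - phi @ theta_g
        R = R + gamma * (np.outer(phi, phi) - R)                      # (1b)
        theta_g = theta_g + gamma * np.linalg.solve(R, phi) * eps_g   # (1a)

max_gap = float(np.max(np.abs(theta_p - theta_g)))
```

Since $\gamma(t)\mathcal{R}^{-1}(t) = P(t)$ for every $t > t_0$, the two estimate sequences agree up to rounding.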

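The SG algorithm (8.7-24) can be written as

$$\theta(t) = \theta(t-1) + a\,\gamma(t)\,\varrho^{-1}(t)\,\varphi(t-1)\,\varepsilon(t) \tag{9.5-3a}$$

$$\varrho(t) = \varrho(t-1) + \gamma(t)\left[\,\|\varphi(t-1)\|^2 - \varrho(t-1)\,\right] \tag{9.5-3b}$$

with $a > 0$,

$$\varrho(t) := \frac{q(t)}{t}, \qquad \varrho(0) := q(0) \tag{9.5-3c}$$

and $\gamma(t)$ given again as in (2b).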

In both cases, $\varepsilon(t)$ and $\varphi(t-1)$ depend in the usual way on the I/O variables
of a CARMA plant

$$A(d)y(t) = B(d)u(t) + C(d)e(t) \tag{9.5-4}$$

with $e$ a stationary zero-mean sequence of independent random variables such that
all moments exist.

A rigorous proof of the ODE method for analysing the stochastic recursive
algorithm is given in [Lju77b] and it turns out to be a quite formidable problem.
We shall give a heuristic derivation of the method together with statements, without
proof, of the main results.

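From (1a) write for $k \ge 1$

$$\theta(t+k) = \theta(t) + \sum_{i=t+1}^{t+k} \gamma(i)\,\mathcal{R}^{-1}(i)\,\varphi(i-1)\,\varepsilon(i)$$

or

$$\frac{\theta(t+k) - \theta(t)}{k} = \frac{1}{k}\sum_{i=t+1}^{t+k} \gamma(i)\,\mathcal{R}^{-1}(i)\,\varphi(i-1)\,\varepsilon(i) \approx \gamma(t)\,\mathcal{R}^{-1}(t)\,\frac{1}{k}\sum_{i=t+1}^{t+k} \varphi(i-1)\,\varepsilon(i)$$

In the above equation we have used the following approximations. First, assuming
$t$ large enough, we see from (1b) and (2b) that $\mathcal{R}(i)$ and $\gamma(i)$ for $t+1 \le i \le t+k$
are slowly time-varying and hence can be well approximated by $\mathcal{R}(t)$ and $\gamma(t)$,
respectively. Second, the regressor $\varphi(i-1)$ and the prediction error $\varepsilon(i)$ depend on
the previous estimates $\theta(j)$, $j = i-1, i-2, \dots$, via an underlying control law which
has not yet been made explicit. Moreover, by (1a) $\theta(j)$ is slowly time-varying.
Hence,

$$\frac{1}{k}\sum_{i=t+1}^{t+k} \varphi(i-1)\,\varepsilon(i) \approx \frac{1}{k}\sum_{i=t+1}^{t+k} \varphi(i-1, \theta(t))\,\varepsilon(i, \theta(t)) \approx \mathcal{E}\{\varphi(t-1, \theta)\,\varepsilon(t, \theta)\}\Big|_{\theta = \theta(t)} =: f(\theta(t)) \tag{9.5-5a}$$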
where with the second approximation we have replaced the time-average over the
interval $[t+1, t+k]$ with the indicated expectation. This is a plausible
approximation in view of the fact that, by the presence of the innovations process $e$ in
(4), both $\varphi(i-1, \theta(t))$ and $\varepsilon(i, \theta(t))$ are processes exhibiting fast time-variations in
their sample paths. The expectation $\mathcal{E}\{\varphi(t-1, \theta)\varepsilon(t, \theta)\}$ has to be taken w.r.t. the
probability density function induced on $u$ and $y$ by $e$, assuming that the system is
in the stochastic steady state corresponding to the constant estimate $\theta$.

From the above discussion we see that the asymptotic behaviour of the stochastic
recursive algorithm can be described in terms of the system of ODEs

$$\frac{d\theta(t)}{dt} = \gamma(t)\,\mathcal{R}^{-1}(t)\,f(\theta(t)) \tag{9.5-5b}$$

$$\frac{d\mathcal{R}(t)}{dt} = \gamma(t)\left[\,G(\theta(t)) - \mathcal{R}(t)\,\right] \tag{9.5-5c}$$

where the latter equation is obtained similarly to the first with

$$G(\theta(t)) := \mathcal{E}\{\varphi(t-1, \theta)\,\varphi'(t-1, \theta)\}\Big|_{\theta = \theta(t)} \tag{9.5-5d}$$

where the expectation $\mathcal{E}$ has to be carried out as indicated above. It is now
convenient to further simplify the above ODEs by operating the following time-scale
change: substitute $t$ with a new independent variable $\tau = \tau(t)$ such that

$$\frac{d\tau}{dt} = \gamma(t) = \frac{1}{t} \quad \text{or} \quad \tau = \ln t \tag{9.5-6a}$$

then, (5) become

$$\frac{d\theta(\tau)}{d\tau} = \mathcal{R}^{-1}(\tau)\,f(\theta(\tau)) \tag{9.5-6b}$$

$$\frac{d\mathcal{R}(\tau)}{d\tau} = -\mathcal{R}(\tau) + G(\theta(\tau)) \tag{9.5-6c}$$

These are called the ODEs associated to the stochastic recursive algorithm (1).

The above considerations make Result 1 below plausible. For the sake of simplicity,
the validity conditions stated there are not the most general but specifically
tailored for our needs.



Result 9.5-1. Consider the stochastic recursive system (1), (4), together with a
linear regulation law

$$u(t) = F'(\theta(t))\,s(t) + \eta(t) \tag{9.5-7}$$

where $s(t)$ is a vector with components from $\varphi(t)$, and $\eta(t)$ has the same
interpretation as in (4-18). Further, assume that:

• $\theta := [\, \Theta_1' \;\cdots\; \Theta_T' \;\; M_1' \;\cdots\; M_T' \,]'$ and $F(\theta)$ are as in (3-17) with $\rho > 0$
so that $F(\theta)$ is a bounded rational function of the components of $\theta$;

• $\theta(t)$ belongs to $\mathcal{D}_s$ for infinitely many $t$, a.s., $\mathcal{D}_s$ being a compact subset of
$\mathbb{R}^{n_\theta}$ in which every vector defines an asymptotically stable closed-loop system
(4), (7);

• $\|\varphi(t-1)\|$ is bounded for infinitely many $t$, a.s.

Then:

i. The trajectories of the ODEs (6) within $\mathcal{D}_s$ are the asymptotic paths of the
estimates generated by the stochastic system (1), (4);

ii. The only possible convergence points of the stochastic recursive algorithm (1)
are the locally stable equilibrium points of the associated ODEs (6). Specifically,
if, with nonzero probability,

$$\lim_{t \to \infty} \theta(t) = \theta_* \quad \text{and} \quad \lim_{t \to \infty} \mathcal{R}(t) = \mathcal{R}_* > 0$$

(with $\theta_* \in \mathcal{D}_s$, necessarily) then $(\theta_*, \mathcal{R}_*)$ is a locally stable equilibrium point
of the associated ODEs (6), viz.

$$f(\theta_*) = O_{n_\theta} \quad \text{and} \quad G(\theta_*) = \mathcal{R}_* \tag{9.5-8}$$

and the matrix

$$\mathcal{R}_*^{-1}\,\frac{\partial f(\theta)}{\partial \theta}\bigg|_{\theta = \theta_*} \tag{9.5-9}$$

has all its eigenvalues in the closed left half-plane.



Remark 9.5-1 

• Note that the time-scale transformation $t \mapsto \tau = \ln t$ in (6a) yields a "time
compression" in that events at large values of $t$ take place earlier in $\tau$. This
is an important advantage of investigating convergence of (1) through simulation
of the associated ODE rather than running the recursive algorithm (1)
itself.

• From the validity conditions of Result 1, in particular the role played there
by $\mathcal{D}_s$, we see that stability in the ODE method must be assumed from the
outset and never comes as a consequence of ODE analysis. This clearly limits
the importance of the ODE method in adaptive control. Nonetheless, the
method is widely applicable to determine necessary conditions for algorithm
convergence to desirable points.

□ 



In the next example we apply the ODE method to analysing the implicit RLS+MV 
ST regulator introduced in Sect. 8.7. We commented there that global convergence 
analysis of such an adaptive regulator is a difficult task and were only able to 
establish global convergence of the SG+MV regulator. In the example below it is 
shown that, though global convergence cannot be addressed, ODE analysis is by 
all means valuable in that it pinpoints some positive real properties on the C(d) 
polynomial that must be satisfied so as to possibly achieve convergence. 

Example 9.5-1 (Implicit RLS+MV ST regulation [Lju77a]) Consider the plant (8.7-1) with
I/O delay equal to one, along with (8.7-6)-(8.7-9) corresponding to MV regulation. In particular,
hereafter we assume that $b_1$ is a priori known and hence is not estimated. The implicit model
used for RLS estimation is then

$$y(t) = b_1 u(t-1) + \varphi'(t-1)\,\theta_{MV} + e(t) \tag{9.5-10a}$$

$$\varphi(t-1) := \left[\, \left(y^{t-1}_{t-h}\right)' \;\; \left(u^{t-2}_{t-h}\right)' \,\right]' \tag{9.5-10b}$$

$$\theta_{MV} := [\, c_1 - a_1 \;\cdots\; c_h - a_h \;\; b_2 \;\cdots\; b_h \,]' \in \mathbb{R}^{n_\theta} \tag{9.5-10c}$$

with $n_\theta = 2h - 1$. Consequently, the RLS algorithm is given by (1) and (2) with

$$\varepsilon(t) = y(t) - b_1 u(t-1) - \varphi'(t-1)\,\theta(t-1)$$

Further, the plant input is

$$u(t) = -\frac{\theta'(t)}{b_1}\,\varphi(t) \tag{9.5-11}$$



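Hence, by (11)

$$\begin{aligned}
\varepsilon(t, \theta) = y(t, \theta) &= b_1 u(t-1, \theta) + \theta^{\circ\prime}\varphi(t-1, \theta) + C(d)e(t) \\
&= (\theta^\circ - \theta)'\varphi(t-1, \theta) + C(d)e(t) \\
&= (\theta^\circ - \theta_{MV})'\varphi(t-1, \theta) + (\theta_{MV} - \theta)'\varphi(t-1, \theta) + C(d)e(t)
\end{aligned} \tag{9.5-12a}$$

where

$$\theta^\circ := [\, -a_1 \;\cdots\; -a_h \;\; b_2 \;\cdots\; b_h \,]' \tag{9.5-12b}$$

Observe that

$$\theta^\circ - \theta_{MV} = -[\, c_1 \;\cdots\; c_h \;\; 0_{1 \times (h-1)} \,]'$$

$$\begin{aligned}
(\theta^\circ - \theta_{MV})'\varphi(t-1, \theta) &= -c_1 y(t-1, \theta) - \cdots - c_h y(t-h, \theta) \\
&= [1 - C(d)]\,y(t, \theta) \\
&= [1 - C(d)]\,\varepsilon(t, \theta)
\end{aligned}$$

Therefore, using this in (12a)

$$\varepsilon(t, \theta) = [1 - C(d)]\,\varepsilon(t, \theta) + (\theta_{MV} - \theta)'\varphi(t-1, \theta) + C(d)e(t)$$

or

$$C(d)\,\varepsilon(t, \theta) = (\theta_{MV} - \theta)'\varphi(t-1, \theta) + C(d)e(t)$$

Hence

$$\varepsilon(t, \theta) = (\theta_{MV} - \theta)'\varphi_c(t-1, \theta) + e(t)$$

where $\varphi_c(t-1, \theta)$ is the filtered regressor

$$\varphi_c(t-1, \theta) := H(d)\,\varphi(t-1, \theta) \tag{9.5-13a}$$

$$H(d) := \frac{1}{C(d)} \tag{9.5-13b}$$

Thus, (5a) becomes

$$\begin{aligned}
f(\theta) &= \mathcal{E}\{\varphi(t-1, \theta)\,\varepsilon(t, \theta)\} \\
&= \mathcal{E}\{\varphi(t-1, \theta)\,\varphi_c'(t-1, \theta)\}(\theta_{MV} - \theta) + \mathcal{E}\{\varphi(t-1, \theta)\,e(t)\} \\
&= \mathcal{E}\{\varphi(t-1, \theta)\,\varphi_c'(t-1, \theta)\}(\theta_{MV} - \theta)
\end{aligned} \tag{9.5-13c}$$

from which the associated ODE follows

$$\dot{\theta}(\tau) = -\mathcal{R}^{-1}(\tau)\,M(\theta(\tau))\,[\theta(\tau) - \theta_{MV}] \tag{9.5-14a}$$

$$\dot{\mathcal{R}}(\tau) = -\mathcal{R}(\tau) + G(\theta(\tau)) \tag{9.5-14b}$$

where the dot denotes derivative,

$$M(\theta) := \mathcal{E}\{\varphi(t-1, \theta)\,\varphi_c'(t-1, \theta)\} \tag{9.5-14c}$$

and $G(\theta)$ as in (5d).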

The equilibrium points $\theta_*$, $\mathcal{R}_*$ of (14) are the solutions of the following algebraic equations

$$M(\theta_*)(\theta_* - \theta_{MV}) = O_{n_\theta} \tag{9.5-15a}$$

$$\mathcal{R}_* = G(\theta_*) > 0 \tag{9.5-15b}$$

A comment on (15b) is in order. If $h$ is larger than the minimum plant order and hence $\varphi(t-1, \theta)$
need not be a full rank process, some measure must be taken so as to make $G(\theta(t))$ positive
definite, e.g. a small positive definite matrix can be added within the brackets in (1b).
To proceed we need the following lemma.
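Lemma 9.5-1. Let $H(d)$ be SPR (Cf. (6.4-34)). Then, for $\xi \in \mathbb{R}^{n_\theta}$

$$M(\theta)\,\xi = O_{n_\theta} \;\Longrightarrow\; \varphi'(t-1, \theta)\,\xi = 0$$

Proof $M(\theta)\xi = O_{n_\theta}$ implies $0 = \xi' M(\theta)\xi = \mathcal{E}\{z(t-1, \theta)\,z_c(t-1, \theta)\}$ where

$$z(t-1, \theta) := \xi'\varphi(t-1, \theta) \quad \text{and} \quad z_c(t-1, \theta) := H(d)\,z(t-1, \theta)$$

Thus, if $\Psi_z(d)$ denotes the spectral density function of $z(t-1, \theta)$, from Problem 7.3-5 it follows
that

$$\begin{aligned}
0 = \xi' M(\theta)\xi &= \langle H(d)\Psi_z(d) \rangle = \tfrac{1}{2}\,\langle H(d)\Psi_z(d) + H^*(d)\Psi_z^*(d) \rangle \\
&= \tfrac{1}{2}\,\langle (H(d) + H^*(d))\,\Psi_z(d) \rangle && [(7.3\text{-}24b)] \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \mathrm{Re}\left[H(e^{j\omega})\right]\Psi_z(e^{j\omega})\,d\omega && [(3.1\text{-}43)]
\end{aligned}$$

Since $\Psi_z(e^{j\omega}) = \Psi_z(e^{-j\omega}) \ge 0$, SPR implies that $\Psi_z(e^{j\omega}) \equiv 0$. □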
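The positivity condition used in the proof, $\mathrm{Re}[H(e^{j\omega})] > 0$ for all $\omega$, can be checked on a frequency grid. The sketch below is only a numerical screen under stated assumptions: the polynomials are hypothetical examples with coefficients listed in ascending powers of $d$, $C(d)$ is assumed to have all its roots outside the unit circle, and a finite grid cannot prove strict positivity.

```python
import numpy as np
from numpy.polynomial import polynomial as npoly

def real_part_on_circle(num, den, n_grid=2048):
    """Re[num(d)/den(d)] at d = exp(j w), w on a uniform grid of [-pi, pi)."""
    w = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    d = np.exp(1j * w)
    return (npoly.polyval(d, num) / npoly.polyval(d, den)).real

def positive_real_on_grid(num, den):
    """True if the real part is strictly positive at every grid frequency."""
    return bool(np.all(real_part_on_circle(num, den) > 0.0))

def one_over_c_minus_half(C):
    """Numerator of 1/C(d) - 1/2 = (1 - C(d)/2) / C(d)."""
    num = -0.5 * np.asarray(C, dtype=float)
    num[0] += 1.0
    return num

C1 = [1.0, 0.5]           # C(d) = 1 + 0.5 d          (assumed example)
C2 = [1.0, 1.5, 0.75]     # C(d) = 1 + 1.5 d + 0.75 d^2 (assumed example)

spr_1 = positive_real_on_grid([1.0], C1)                             # 1/C1
spr_2 = positive_real_on_grid([1.0], C2)                             # 1/C2
spr_1_shifted = positive_real_on_grid(one_over_c_minus_half(C1), C1)  # 1/C1 - 1/2
```

For the second polynomial the real part of $C(e^{j\omega})$ equals $-0.125$ at $\omega = 2\pi/3$, so the grid test fails there even though $C_2(d)$ is stable.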

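Then, by Lemma 1, assuming $1/C(d)$ SPR or, equivalently, $C(d)$ SPR, (15a) implies that
$\varphi'(t-1, \theta_*)(\theta_* - \theta_{MV}) = 0$. Hence, from (12a)

$$\begin{aligned}
y(t, \theta_*) &= \varphi'(t-1, \theta_*)(\theta^\circ - \theta_{MV}) + C(d)e(t) \\
&= [1 - C(d)]\,y(t, \theta_*) + C(d)e(t)
\end{aligned}$$

i.e.

$$C(d)\,y(t, \theta_*) = C(d)\,e(t) \quad \text{or} \quad y(t, \theta_*) = e(t) \tag{9.5-16}$$

We can conclude that:

$$\left[\begin{array}{c} \mathrm{Re}\,C(e^{j\omega}) > 0 \\ \forall \omega \in [-\pi, \pi) \end{array}\right] \;\Longrightarrow\; \left\{\begin{array}{l} \text{all the equilibrium points of the ODE} \\ \text{associated to the implicit RLS+MV} \\ \text{regulator yield the MV regulation law} \end{array}\right. \tag{9.5-17}$$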

This means that the implicit RLS+MV ST regulator is weakly self-optimizing w.r.t. the MV
criterion. Assume now that $h$ is not only an upper bound of the minimum plant order but that
it equals the latter, viz.

$$h = \max\{n_a, n_b, n_c\} \tag{9.5-18}$$

Then, under (18), (16) implies $\theta_* = \theta_{MV}$. We study next local stability of $\theta_{MV}$ under the
assumption (18). We have from (13c)

$$f(\theta) = -M(\theta)(\theta - \theta_{MV})$$

and

$$\mathcal{R}_{MV}^{-1}\,\frac{\partial f(\theta)}{\partial \theta}\bigg|_{\theta = \theta_{MV}} = -\mathcal{R}_{MV}^{-1}\,M(\theta_{MV}) = -G^{-1}(\theta_{MV})\,M(\theta_{MV}) \tag{9.5-19}$$

According to Result 1, in order to show that $\theta_{MV}$ is a possible convergence point, we have to prove
that all the eigenvalues of (19) are in the closed left half-plane. To this end, we shall use the
following lemma.

Lemma 9.5-2. Let $S$ and $M$ be two real-valued square matrices such that

$$S = S' > 0 \quad \text{and} \quad M + M' > 0$$

Then, $-SM$ is a stability matrix.

Proof Consider the linear differential equation $\dot{x}(t) = -SMx(t)$. For $P = S^{-1}$ we have

$$P(-SM) + (-SM)'P + (M + M') = 0$$

Hence, since $P = P' > 0$, it follows that $(-SM, H)$, $H'H = M + M' > 0$, is an observable pair
and that $-SM$ is a stability matrix. (Cf. Problem 2.4-3). □
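Lemma 2 is easy to probe numerically. The following sketch builds a random symmetric positive definite $S$ and a nonsymmetric $M$ whose symmetric part is positive definite (all values are arbitrary assumptions), then checks both the Lyapunov identity used in the proof and the location of the eigenvalues of $-SM$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

A = rng.standard_normal((n, n))
S = A @ A.T + n * np.eye(n)                 # S = S' > 0

B = rng.standard_normal((n, n))
K = rng.standard_normal((n, n))
M = B @ B.T + np.eye(n) + (K - K.T)         # M + M' = 2 (B B' + I) > 0, M not symmetric

closed_loop = -S @ M
P = np.linalg.inv(S)

# Lyapunov identity from the proof: P(-SM) + (-SM)'P + (M + M') = 0
lyapunov_residual = P @ closed_loop + closed_loop.T @ P + (M + M.T)

# Largest real part among the eigenvalues of -SM
max_real_part = float(np.linalg.eigvals(closed_loop).real.max())
```

The skew-symmetric term `K - K.T` shows that the lemma does not require $M$ itself to be symmetric, which matters because $M(\theta)$ in (14c) generally is not.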



We can now prove that in the implicit RLS+MV ST regulator w.s. self-tuning occurs.

Proposition 9.5-1. Consider the implicit RLS+MV ST regulator applied to the given CARMA
plant. Let $h$ be set as in (18). Then, provided that $C(d)$ is SPR, the only possible convergence
point is $\theta_{MV}$.

Proof By the proof of Lemma 1 we have for $H(d) = 1/C(d)$

$$\xi'(M + M')\,\xi = \frac{1}{\pi}\int_{-\pi}^{\pi} \mathrm{Re}\left[H(e^{j\omega})\right]\Psi_z(e^{j\omega})\,d\omega > 0$$

whenever $C(d)$ is SPR. Then, applying Lemma 2, we conclude that (19) is a stability matrix. □

Via the ODE method it can be also shown [Lju77a] that, with the choice (18) and under the validity
conditions of Result 1, the implicit RLS+MV above does indeed converge to $\theta_{MV}$ provided that
the same SPR condition as in Theorem 6.4-36 holds

$$\frac{1}{C(d)} - \frac{1}{2} \ \text{ is SPR} \tag{9.5-20}$$

Note that (20) is a stronger condition than $C(d)$ being SPR.

In [Lju77a] it is also shown via the ODE method that the variant of the adaptive regulator
discussed above with RLS substituted by the SG identifier (3) with $a = 1$, under the general
validity conditions of Result 1, does indeed converge to the MV control law provided that $C(d)$
is SPR. Note that this conclusion agrees with Theorem 8.7-1 where global convergence of the
SG+MV ST regulator was established via the stochastic Lyapunov function method. It has to be
pointed out that the condition $\theta_{MV} \in \mathcal{D}_s$ implies that the plant must be minimum-phase.
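To make the ODE (14) concrete, the sketch below integrates it for a hypothetical first-order case ($h = 1$, so $\theta$ is scalar and $\varphi(t-1) = y(t-1)$). The plant coefficients are assumed values; with $u(t) = -(\theta/b_1)y(t)$ the closed loop is $y(t) + \alpha y(t-1) = e(t) + c\,e(t-1)$, $\alpha = a + \theta$, and the stationary covariances of this ARMA(1,1) process give $f(\theta) = \mathcal{E}\{y(t-1)y(t)\}$ and $G(\theta) = \mathcal{E}\{y^2(t-1)\}$ in closed form. A plain Euler scheme is used, which is an illustrative choice and not the book's procedure.

```python
import numpy as np

# Hypothetical plant (1 + a d) y(t) = b1 d u(t) + (1 + c d) e(t), assumed values.
a, b1, c, sigma2 = -0.5, 1.0, 0.3, 1.0
theta_mv = c - a                      # MV parameter theta_MV = c1 - a1

def f_and_G(theta):
    """Closed-form f(theta) and G(theta) for the closed loop with alpha = a + theta."""
    alpha = a + theta
    G = sigma2 * (1.0 + c * c - 2.0 * alpha * c) / (1.0 - alpha * alpha)   # E{y^2}
    f = sigma2 * (c - alpha) * (1.0 - alpha * c) / (1.0 - alpha * alpha)   # E{y(t-1) y(t)}
    return f, G

# Euler integration of d(theta)/d(tau) = f / R, dR/d(tau) = G - R
theta, R, dtau = 0.0, 1.0, 0.01
trajectory = [theta]
for _ in range(2000):                 # tau runs from 0 to 20
    f, G = f_and_G(theta)
    theta, R = theta + dtau * f / R, R + dtau * (G - R)
    trajectory.append(theta)
trajectory = np.array(trajectory)
```

The trajectory approaches $\theta_{MV} = c - a$ and $\mathcal{R}$ approaches $\sigma_e^2$, the output variance under MV regulation, in agreement with (16). For a first-order $C(d)$ the SPR condition always holds, so this case cannot exhibit the non-convergence that a non-SPR $C(d)$ may cause.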

Problem 9.5-1 (ODEs for SG-based algorithms) Following a heuristic derivation similar to the
one used for RLS-based algorithms, show that the ODEs associated with SG-based algorithms
are given by

$$\frac{d\theta(\tau)}{d\tau} = \frac{a}{\varrho(\tau)}\,f(\theta(\tau))$$

$$\frac{d\varrho(\tau)}{d\tau} = -\varrho(\tau) + g(\theta(\tau))$$

with

$$g(\theta) := \mathcal{E}\{\|\varphi(t-1, \theta)\|^2\}$$

and $f(\theta)$ again as in (5a).



Problem 9.5-2 (ODE analysis of the SG+MV regulator) Use the ODE associated to the
SG-based algorithms in Problem 1 and conditions similar to (8), (9), in order to find the possible
convergence points of the SG+MV ST regulator (8.7-24), (8.7-25).

9.5.2 MUSMAR ODE Analysis 

Hereafter a SISO CARMA plant is considered

$$A(d)y(t) = B(d)u(t-\ell) + C(d)e(t) \tag{9.5-21}$$

with all the properties as in (2-1). In (21) $\ell \ge 1$ is the I/O delay, and $e$ is a
stationary zero-mean sequence of independent random variables with variance $\sigma_e^2$
and such that all moments exist. Moreover, let

$$n = \max\{\partial A(d),\ \partial B(d) + \ell,\ \partial C(d)\}$$

Associated with (21), a quadratic cost-functional is considered 

C T := \e {L T } (9.5-22a) 



1 T_1 

L T := - [y 2 (t + 1 + l ) + P« 2 (* + 0] (9.5-22b) 



i=0 






where $\rho > 0$. We recall that in the previous sections the MUSMAR algorithm has
been introduced in order to adaptively select a feedback-gain vector $F$ such that
in stochastic steady-state

$$u(t) = F's(t) + \eta(t) \tag{9.5-23}$$

minimizes in a receding-horizon sense the cost (22) for the unknown plant. In (23),
$\eta$ is a stationary zero-mean white dither sequence of independent random variables
with variance $\sigma_\eta^2$ independent of $e$, and $s(t)$ denotes the pseudostate made up by
past I/O and, possibly, dither samples. In order to deal with a causal control
strategy, the pseudostate $s(t)$ is made up by any finite subset of the available data

$$I^t := \{y^t,\ u^{t-1},\ \eta^{t-1}\}$$

Problem 3 shows that the control law of the form

$$u(t) = \bar{u}(t) + \eta(t) \tag{9.5-24a}$$

minimizing in stochastic steady-state the cost $C_\infty$, with $\bar{u}(t) \in \sigma\{I^t\}$, is given by

$$R_0(d)u(t) = -S_0(d)y(t) + C(d)\eta(t) \tag{9.5-24b}$$

Problem 9.5-3 Show that the control law of the form (24a), minimizing in stochastic
steady-state the cost $C_\infty$, is given by (24b). [Hint: Consider any causal regulation law (Cf. (7.3-72))

$$R_0(d)u(t) = -S_0(d)y(t) + \eta(t) + v(t)$$

where $R_0(d)$ and $S_0(d)$ are the LQS regulation polynomials

$$A(d)R_0(d) + d^{\ell}B(d)S_0(d) = C(d)E(d)$$

$$E^*(d)E(d) = \rho A^*(d)A(d) + B^*(d)B(d)$$

and $v$ any wide-sense stationary process such that $v(t) \in \sigma\{I^t\}$. Following the same lines used
after (7.3-72), show that under the above control law

1 



where Clqs is the LQS cost for n(t) 
to show that Coo is minimized for 



Clqs + £ 
v(t) 



C(d) 



v(t) + V (t)} 



0. Take into account the constraint v(t) S <r {/'}, 



v(t) = -C(d)£ { I 
Hence, conclude that $v(t) = [C(d) - 1]\,\eta(t)$. ]



(t) 



C(d) 



I* 



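In the absence of dither, i.e. $\sigma_\eta^2 = 0$, (24b) coincides with the optimal regulation
law for the LQS regulation problem under consideration. Eq. (24b) can be written
in the form (23) with $F \in \mathbb{R}^{2n+m}$ and

$$s(t) := \left[\, \left(y^{t}_{t-n+1}\right)' \;\; \left(u^{t-1}_{t-m}\right)' \;\; \left(\eta^{t-1}_{t-n}\right)' \,\right]' \tag{9.5-25a}$$

with (Cf. Problem 7.3-16)

$$m = \begin{cases} n-1, & \rho = 0 \\ n, & \rho > 0 \end{cases} \tag{9.5-25b}$$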



This result shows that, in order to make the MUSMAR algorithm compatible with
the steady-state LQS regulator, past dither samples should be included in the
regressor. In practice, an unknown input dither component is unavoidably present
due to the finite word-length of the digital processor implementing the algorithm.
Consequently, in order to deal with this more realistic situation, we assume
hereafter that the pseudostate vector only includes past I/O samples and no dither,
viz.

$$s(t) := \left[\, \left(y^{t}_{t-\hat{n}+1}\right)' \;\; \left(u^{t-1}_{t-\hat{m}}\right)' \,\right]' \tag{9.5-26a}$$

where $\hat{n}$ and $\hat{m}$ are two nonnegative integers. In case $\hat{n}$ is the presumed order of
the plant, $\hat{m}$ can be chosen according to the rule

This situation will be referred to as the unknown dither case. Problem 4 shows that,
when the pseudostate is given by (26a) with $\hat{n} \ge n$ and $\hat{m}$ as in (26b), the control
law (23), minimizing in stochastic steady-state the cost $C_\infty$ for the CARMA plant
(21), is given by



R (d)u(t) = -S (d)y(t) + n(t) - C(d)£ 
Since, for cr^/al < 1, 



C(d) 



s(t) \ (9.5-27) 



2 



: *^ 



C{d) 

(27) tends to the steady-state LQS optimal regulation law as $\sigma_\eta^2 \to 0$.

Problem 9.5-4 Show that, when the pseudostate is given by (26a) with $\hat{n} \ge n$ and $\hat{m}$ as in
(26b), the control law (23), minimizing in stochastic steady-state the cost $C_\infty$ for the CARMA
plant (21), is given by (27). [Hint: Proceed as indicated in the hint of Problem 3 by replacing
$I^t$ with $s(t)$. ]

Problem 9.5-5 Prove that in the unknown dither case, for any $\hat{n}$ and $\hat{m}$, and under a stabilizing
regulation law, the pseudostate covariance matrix $\Psi_s := \mathcal{E}\{s(t)s'(t)\}$ is always positive definite.
[Hint: Construct a proof by contradiction. ]

Remark 9.5-2 As Problem 5 shows, in the unknown dither case, for any $\hat{n}$ and $\hat{m}$
and under a stabilizing regulation law, the pseudostate covariance matrix $\Psi_s :=
\mathcal{E}\{s(t)s'(t)\}$ is always strictly positive definite. This property makes it convenient
for the ODE analysis to assume the presence of a nonzero dither in (23). However,
in practice, the dither need not be used provided that the identifiers are equipped
with a covariance resetting logic or variants thereof. □

MUSMAR is based on the set of 2T prediction models (9.3-11), one for each output 
and input variable included in the cost-functional (22), viz. 



(9.5-28) 



y(t + i + £) = eiu{t) + T' iS {t)+ei{t + i) 
u(t + i-l) = mu(t) + A<s(i) + v>i(t + i - 1) 

where $\hat{\ell} \le \ell$. Notice that all the models in (28) share the same regressor

$$\varphi(t) := [\, u(t) \;\; s'(t) \,]'$$

In the following, we set for simplicity $\hat{\ell} = 1$. The parameters of the prediction
models in (28) are separately estimated via standard RLS identifiers as shown
in Fig. 3-4. The estimate updating equations are therefore given by (3-16): for
$i = 1, \dots, T$



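$$\begin{bmatrix} \theta_i(t) \\ \Gamma_i(t) \end{bmatrix} = \begin{bmatrix} \theta_i(t-1) \\ \Gamma_i(t-1) \end{bmatrix} + K(t-T)\left[\, y(t-T+i) - \theta_i(t-1)\,u(t-T) - \Gamma_i'(t-1)\,s(t-T) \,\right] \tag{9.5-29a}$$

and, for $i = 2, \dots, T$

$$\begin{bmatrix} \mu_i(t) \\ \Lambda_i(t) \end{bmatrix} = \begin{bmatrix} \mu_i(t-1) \\ \Lambda_i(t-1) \end{bmatrix} + K(t-T)\left[\, u(t-T+i-1) - \mu_i(t-1)\,u(t-T) - \Lambda_i'(t-1)\,s(t-T) \,\right] \tag{9.5-29b}$$

In (29) $K(t-T) = P(t-T+1)\varphi(t-T)$ denotes the updating gain. Obviously
$\mu_1(t) = 1$ and $\Lambda_1(t) = 0$. Finally the control signal is chosen, at each sampling
instant $t$, according to (Cf. (3-17))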



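$$u(t) = F'(t)s(t) + \eta(t) \tag{9.5-30}$$

$$F(t) := -\Xi^{-1}(t)\sum_{i=1}^{T}\left[\,\theta_i(t)\Gamma_i(t) + \rho\,\mu_i(t)\Lambda_i(t)\,\right] \tag{9.5-31}$$

$$\Xi(t) := \sum_{i=1}^{T}\left[\,\theta_i^2(t) + \rho\,\mu_i^2(t)\,\right] \tag{9.5-32}$$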
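The feedback-gain computation can be read as a one-variable least-squares problem: given the multipredictor parameters, the gain is the one for which $u = F's$ minimizes the predicted cost $\sum_i [(\theta_i u + \Gamma_i's)^2 + \rho(\mu_i u + \Lambda_i's)^2]$. The sketch below assumes the SISO setting of this section, with the scalar normalizer taken as $\Xi = \sum_i(\theta_i^2 + \rho\mu_i^2)$, which is what this minimization yields; the function name and all numerical values are hypothetical.

```python
import numpy as np

def musmar_gain(thetas, Gammas, mus, Lambdas, rho):
    """F = -Xi^{-1} sum_i [theta_i Gamma_i + rho mu_i Lambda_i],
    with Xi = sum_i [theta_i^2 + rho mu_i^2] (SISO case)."""
    thetas, mus = np.asarray(thetas, float), np.asarray(mus, float)
    Gammas, Lambdas = np.asarray(Gammas, float), np.asarray(Lambdas, float)
    Xi = float(np.sum(thetas**2 + rho * mus**2))
    F = -(thetas @ Gammas + rho * (mus @ Lambdas)) / Xi
    return F, Xi

# Hypothetical multipredictor estimates: horizon T = 3, pseudostate dimension 4.
rng = np.random.default_rng(3)
T, ns, rho = 3, 4, 0.5
thetas = rng.standard_normal(T)
Gammas = rng.standard_normal((T, ns))
mus = rng.standard_normal(T)
Lambdas = rng.standard_normal((T, ns))
mus[0], Lambdas[0] = 1.0, 0.0        # first input predictor is trivial: mu_1 = 1, Lambda_1 = 0

F, Xi = musmar_gain(thetas, Gammas, mus, Lambdas, rho)

s = rng.standard_normal(ns)          # an arbitrary pseudostate sample

def predicted_cost(u):
    """Multistep quadratic cost predicted by the models for input u and pseudostate s."""
    return float(np.sum((thetas * u + Gammas @ s) ** 2
                        + rho * (mus * u + Lambdas @ s) ** 2))

u_star = float(F @ s)
cost_at_optimum = predicted_cost(u_star)
```

Because $\mu_1 = 1$, the normalizer satisfies $\Xi \ge \rho > 0$, so the gain is always well defined when $\rho > 0$.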



Conditions on the form of the feedback $F$ in (23), under which the regulated
CARMA plant (21) admits in stochastic steady-state the multiple prediction
model (28), have been given in Theorem 3-1. Hereafter, an ODE local convergence
analysis of the MUSMAR algorithm is carried out. No assumption is made
on the pseudostate complexity $\hat{n}$ or the true I/O delay. The main interest is in the
possible convergence points of the MUSMAR feedback-gain vector $F$.

Since, if $F$ converges to a stabilizing control law, $\Psi_s$ converges to a strictly
positive definite bounded matrix, Theorem 6.4-2 shows that also the parameter
estimates $\theta(t) := \{\theta_i(t), \Gamma_i(t), \mu_i(t), \Lambda_i(t)\}_{i=1}^{T}$ converge. Thus, the only possible
convergence feedback gains are the ones provided by (31) and (32) for a given
parameter estimate $\theta(t)$ at convergence.

Since the multipredictor coefficients of (28) are estimated via standard RLS
algorithms, the asymptotic average evolution of their estimates is described by the
following set of ODEs: (Cf. (6))



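$$\begin{bmatrix} \dot{\theta}_i(\tau) \\ \dot{\Gamma}_i(\tau) \end{bmatrix} = \mathcal{R}^{-1}(\tau)\,f_{\theta_i}(\tau) \tag{9.5-33a}$$

$$\begin{bmatrix} \dot{\mu}_i(\tau) \\ \dot{\Lambda}_i(\tau) \end{bmatrix} = \mathcal{R}^{-1}(\tau)\,f_{\mu_i}(\tau) \tag{9.5-33b}$$

$$\dot{\mathcal{R}}(\tau) = -\mathcal{R}(\tau) + \mathcal{R}_\varphi(\tau) \tag{9.5-33c}$$

where

$$f_{\theta_i}(\tau) := \mathcal{E}\left\{\varphi(t)\left[\, y(t+i) - [\theta_i(\tau)F(\tau) + \Gamma_i(\tau)]'s(t) - \theta_i(\tau)\eta(t) \,\right] \,\Big|\, F(\tau)\right\} \tag{9.5-34a}$$

$$f_{\mu_i}(\tau) := \mathcal{E}\left\{\varphi(t)\left[\, u(t+i-1) - [\mu_i(\tau)F(\tau) + \Lambda_i(\tau)]'s(t) - \mu_i(\tau)\eta(t) \,\right] \,\Big|\, F(\tau)\right\} \tag{9.5-34b}$$

$$\mathcal{R}_\varphi(\tau) := \mathcal{E}\{\varphi(t)\varphi'(t) \mid F(\tau)\} \tag{9.5-34c}$$

$$\phantom{\mathcal{R}_\varphi(\tau) :}= \begin{bmatrix} F'(\tau)\mathcal{R}_s(\tau)F(\tau) + \sigma_\eta^2 & F'(\tau)\mathcal{R}_s(\tau) \\ \mathcal{R}_s(\tau)F(\tau) & \mathcal{R}_s(\tau) \end{bmatrix} \tag{9.5-34d}$$

and $F(\tau)$ is as in (31) with $t$ replaced by $\tau$. In (34) $\mathcal{E}\{\cdot \mid F(\tau)\}$ denotes the
expectation w.r.t. the probability density function induced on $u$ and $y$ by $e$ and
$\eta$, assuming that the system is in the stochastic steady-state corresponding to the
constant control law

$$u(t) = F'(\tau)s(t) + \eta(t) \tag{9.5-35}$$

It is now convenient to derive the ODE for $F(\tau)$ from (33)-(35).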

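Lemma 9.5-3. Consider the ODEs (33)-(35) associated with the MUSMAR
algorithm (29)-(32). Then, the ODE associated with the MUSMAR feedback-gain
vector can be written as follows

$$\dot{F}(\tau) = -\Xi^{-1}(\tau)\,\mathcal{R}_s^{-1}(\tau)\,p(\tau) + o(\|\tilde{F}(\tau)\|) \tag{9.5-36}$$

where $\tilde{F}(\tau) := F(\tau) - F_*$; $F_*$ denotes any equilibrium point of (36); $o(\|x\|)$ is such
that $\lim_{x \to 0}[o(\|x\|)/\|x\|] = 0$;

$$p(\tau) := \sigma_\eta^{-2}\sum_{i=1}^{T}\left[\,\mathcal{R}_{y\eta}(i;\tau)\,\mathcal{R}_{ys}(i;\tau) + \rho\,\mathcal{R}_{u\eta}(i-1;\tau)\,\mathcal{R}_{us}(i-1;\tau)\,\right] \tag{9.5-37}$$

and

$$\mathcal{R}_{y\eta}(i;\tau) := \mathcal{E}\{y(t+i)\,\eta(t) \mid F(\tau)\} \tag{9.5-38a}$$

$$\mathcal{R}_{ys}(i;\tau) := \mathcal{E}\{y(t+i)\,s(t) \mid F(\tau)\} \tag{9.5-38b}$$

with similar definitions for $\mathcal{R}_{u\eta}(i-1;\tau)$ and $\mathcal{R}_{us}(i-1;\tau)$.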



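Proof Premultiplying both sides of (33a) by $\mathcal{R}(\tau) = \mathcal{R}_\varphi(\tau) - \dot{\mathcal{R}}(\tau)$ and using (35), we have ($\tau$ is omitted)

$$\mathcal{R}\begin{bmatrix}\dot\theta_i \\ \dot\Gamma_i\end{bmatrix} = \mathcal{E}\left\{\begin{bmatrix}F's(t) + \eta(t) \\ s(t)\end{bmatrix}[y(t+i) - [\theta_iF + \Gamma_i]'s(t) - \theta_i\eta(t)] \,\Big|\, F\right\} = \begin{bmatrix}F'\mathcal{R}_{ys}(i) - F'\mathcal{R}_s[\theta_iF + \Gamma_i] + \mathcal{R}_{y\eta}(i) - \sigma_\eta^2\theta_i \\ \mathcal{R}_{ys}(i) - \mathcal{R}_s[\theta_iF + \Gamma_i]\end{bmatrix}$$ (9.5-39)

where $\mathcal{R}_{ys}$ and $\mathcal{R}_{y\eta}$ are defined in (38). Recalling (34c), we have also

$$\mathcal{R}_\varphi\begin{bmatrix}\dot\theta_i \\ \dot\Gamma_i\end{bmatrix} = \begin{bmatrix}[F'\mathcal{R}_sF + \sigma_\eta^2]\dot\theta_i + F'\mathcal{R}_s\dot\Gamma_i \\ \mathcal{R}_s[F\dot\theta_i + \dot\Gamma_i]\end{bmatrix}$$ (9.5-40)

If we define

$$[\,K_i\ \ G_i'\,]' := \dot{\mathcal{R}}[\,\dot\theta_i\ \ \dot\Gamma_i'\,]'$$ (9.5-41)

and

$$g_i := \mathcal{R}_{ys}(i) - \mathcal{R}_s[\theta_iF + \Gamma_i]$$ (9.5-42)

and taking into account (40), (39) can be rewritten as follows:

$$F'\mathcal{R}_s[F\dot\theta_i + \dot\Gamma_i] + \sigma_\eta^2\dot\theta_i = F'g_i + \mathcal{R}_{y\eta}(i) - \sigma_\eta^2\theta_i + K_i$$

$$\mathcal{R}_s[F\dot\theta_i + \dot\Gamma_i] = g_i + G_i$$

Taking into account the latter into the former, one gets

$$F'[g_i + G_i] + \sigma_\eta^2\dot\theta_i = F'g_i + \mathcal{R}_{y\eta}(i) - \sigma_\eta^2\theta_i + K_i$$

Thus (39) can be rewritten as

$$\dot\theta_i = -\theta_i + \sigma_\eta^{-2}\mathcal{R}_{y\eta}(i) + \sigma_\eta^{-2}[K_i - F'G_i]$$ (9.5-43a)

$$\mathcal{R}_s[F\dot\theta_i + \dot\Gamma_i] = g_i + G_i$$ (9.5-43b)

In a similar way, (33b) can be rewritten as

$$\dot\mu_i = -\mu_i + \sigma_\eta^{-2}\mathcal{R}_{u\eta}(i-1) + \sigma_\eta^{-2}[V_i - F'H_i]$$ (9.5-44a)

$$\mathcal{R}_s[F\dot\mu_i + \dot\Lambda_i] = h_i + H_i$$ (9.5-44b)

where

$$[\,V_i\ \ H_i'\,]' := \dot{\mathcal{R}}[\,\dot\mu_i\ \ \dot\Lambda_i'\,]'$$ (9.5-45)

$$h_i := \mathcal{R}_{us}(i-1) - \mathcal{R}_s[\mu_iF + \Lambda_i]$$ (9.5-46)

Taking into account (31), the corresponding ODE for $F(\tau)$ is obtained

$$\dot F(\tau) = -\frac{1}{\Xi}\sum_{i=1}^{T}\{\dot\theta_i[\theta_iF + \Gamma_i] + \rho\dot\mu_i[\mu_iF + \Lambda_i] + \theta_i[\dot\Gamma_i + \dot\theta_iF] + \rho\mu_i[\dot\Lambda_i + \dot\mu_iF]\} = -\Xi^{-1}(\tau)\mathcal{R}_s^{-1}(\tau)[p(\tau) - r(\tau)]$$ (9.5-47)

where the last equality follows from (43) and (44) if $p(\tau)$ is as in (37) and

$$r(\tau) := \sum_{i=1}^{T}\Big\{\sigma_\eta^{-2}[K_i(\tau) - F'(\tau)G_i(\tau)]\mathcal{R}_s^{-1}(\tau)[\mathcal{R}_{ys}(i;\tau) + G_i(\tau)] + \rho\sigma_\eta^{-2}[V_i(\tau) - F'(\tau)H_i(\tau)]\mathcal{R}_s^{-1}(\tau)[\mathcal{R}_{us}(i-1;\tau) + H_i(\tau)] + \sigma_\eta^{-2}\mathcal{R}_s^{-1}(\tau)[\mathcal{R}_{y\eta}(i;\tau)G_i(\tau) + \rho\mathcal{R}_{u\eta}(i-1;\tau)H_i(\tau)] - \dot\theta_i(\tau)[\theta_i(\tau)F(\tau) + \Gamma_i(\tau)] - \rho\dot\mu_i(\tau)[\mu_i(\tau)F(\tau) + \Lambda_i(\tau)]\Big\}$$

Now it is to be noticed that if $\theta^* = \{\theta_i^*, \Gamma_i^*, \mu_i^*, \Lambda_i^*\}_{i=1}^{T}$ denotes any equilibrium point of (33) and, according to (31), $F^*$ the corresponding feedback-gain vector and $\tilde F(\tau) := F(\tau) - F^*$, then

$$K_i(\tau) - F'(\tau)G_i(\tau) = \sigma_\eta^2\, o(\|\tilde F(\tau)\|)$$
$$V_i(\tau) - F'(\tau)H_i(\tau) = \sigma_\eta^2\, o(\|\tilde F(\tau)\|)$$
$$G_i(\tau) = o(\|\tilde F(\tau)\|)$$
$$H_i(\tau) = o(\|\tilde F(\tau)\|)$$
$$\dot\theta_i[\theta_iF + \Gamma_i] = o(\|\tilde F(\tau)\|)$$
$$\dot\mu_i[\mu_iF + \Lambda_i] = o(\|\tilde F(\tau)\|)$$

Consequently,

$$r(\tau) = o(\|\tilde F(\tau)\|)$$ (9.5-48)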

In order to give a convenient interpretation to (37), the following lemma is introduced.

Lemma 9.5-4. Let $\chi(d;\tau) = \chi(d;F(\tau))$ be the closed-loop d-characteristic polynomial corresponding to $F(\tau)$. Then

$$\sigma_\eta^{-2}\mathcal{R}_{y\eta}(i;\tau) = \left[\frac{d^\ell B(d)}{\chi(d;\tau)}\right]_i$$ (9.5-49a)

$$\sigma_\eta^{-2}\mathcal{R}_{u\eta}(i;\tau) = \left[\frac{A(d)}{\chi(d;\tau)}\right]_i$$ (9.5-49b)

where $[H(d)]_i$ denotes the i-th sample of the impulse response associated with the transfer function $H(d)$.

Proof In closed-loop (35) can be written as

$$R(d;\tau)u(t) = -S(d;\tau)y(t) + \eta(t)$$

Consequently, if

$$\chi(d;\tau) = A(d)R(d;\tau) + d^\ell B(d)S(d;\tau)$$

we find

$$y(t) = \frac{d^\ell B(d)}{\chi(d;\tau)}\eta(t) + \frac{C(d)R(d;\tau)}{\chi(d;\tau)}e(t)$$

$$u(t) = \frac{A(d)}{\chi(d;\tau)}\eta(t) - \frac{C(d)S(d;\tau)}{\chi(d;\tau)}e(t)$$

Since $\eta$ and $e$ are uncorrelated, (49) easily follows.
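The impulse-response samples appearing in (49) can be computed by polynomial long division. The following Python sketch is only an illustration under stated assumptions: polynomials in $d$ are stored as coefficient arrays in ascending powers, and the function names and the first-order numerical example ($A(d) = 1 - 0.5d$, $B(d) = 1$, $\ell = 1$, $R(d) = 1$, $S(d) = 0.2$) are hypothetical, not taken from the text.

```python
import numpy as np

def closed_loop_poly(A, B, R, S, ell):
    """chi(d) = A(d)R(d) + d^ell B(d)S(d), coefficients in ascending powers of d."""
    AR = np.convolve(A, R)
    BS = np.concatenate([np.zeros(ell), np.convolve(B, S)])
    chi = np.zeros(max(len(AR), len(BS)))
    chi[:len(AR)] += AR
    chi[:len(BS)] += BS
    return chi

def impulse_response(num, den, n_samples):
    """First n_samples power-series coefficients of num(d)/den(d)."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    h = np.zeros(n_samples)
    for i in range(n_samples):
        acc = num[i] if i < len(num) else 0.0
        for k in range(1, min(i, len(den) - 1) + 1):
            acc -= den[k] * h[i - k]
        h[i] = acc / den[0]
    return h

# Hypothetical first-order example.
A, B, R, S, ell = [1.0, -0.5], [1.0], [1.0], [0.2], 1
chi = closed_loop_poly(A, B, R, S, ell)                 # 1 - 0.3 d
dB = np.concatenate([np.zeros(ell), B])                 # d^ell B(d)
h_y = impulse_response(dB, chi, 4)                      # samples of d^ell B / chi
h_u = impulse_response(A, chi, 4)                       # samples of A / chi
```

In this example the samples `h_y[i]` and `h_u[i]` play the role of the right-hand sides of (49a) and (49b), respectively.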

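According to (49), (37) can be rewritten as follows:

$$p(\tau) = \sum_{i=1}^{T}\mathcal{E}\left\{\left[\frac{d^\ell B(d)}{\chi(d;\tau)}\right]_i y(t+i)s(t) + \rho\left[\frac{A(d)}{\chi(d;\tau)}\right]_{i-1}u(t+i-1)s(t) \,\Big|\, F(\tau)\right\}$$

$$= \mathcal{E}\left\{y(t)\left[\frac{d^\ell B(d)}{\chi(d;\tau)}\Big|_{T}\,s(t)\right] + \rho u(t)\left[\frac{A(d)}{\chi(d;\tau)}\Big|_{T-1}\,s(t)\right] \,\Big|\, F(\tau)\right\}$$ (9.5-50)

where $H(d)|_T$ denotes the truncation to the T-th power of the power series expansion in $d$ of the transfer function $H(d)$, viz.

$$H(d)|_T = \sum_{i=0}^{T}h_id^i \quad \text{if} \quad H(d) = \sum_{i=0}^{\infty}h_id^i$$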

It will now be shown that (50) can be interpreted as the gradient of the cost (22) 
in a receding-horizon sense. In order to see this, let us introduce the following 
receding-horizon variant of the cost (22) 



$$C_T(F,l) := T^{-1}\mathcal{E}\{L_T(F,l)\}$$ (9.5-51a)

$$L_T(F,l) := \frac{1}{2}\sum_{i=1}^{T}[y^2(t+i) + \rho u^2(t+i-1)], \qquad u(k) = F's(k)\ \ (k \neq t), \quad u(t) = l's(t)$$ (9.5-51b)

The idea is to evaluate the cost assuming that all inputs, except the first included in the cost, are given by a constant stabilizing feedback $F$ for all times and since the remote past. Then, denoting by $\nabla_lC_T(F,l)$ the value taken on at $l = F$ by the gradient of $C_T(F,l)$ w.r.t. $l$

$$\nabla_lC_T(F,l) := \frac{\partial C_T(F,l)}{\partial l}\Big|_{l=F}$$ (9.5-52)

we get the following.



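Lemma 9.5-5. Let $F(\tau)$ be a stabilizing feedback for the plant (1), and $p(\tau)$ as in (50). Then,

$$p(\tau) = T\nabla_lC_T(F,l)$$ (9.5-53)

Proof Let

$$u(t) = l's(t) = F's(t) + (l - F)'s(t)$$

Thus, for all $k$

$$u(k) = F's(k) + v(t)\delta_{t,k}$$

$$R(d;F)u(k) = -S(d;F)y(k) + v(t)\delta_{t,k}$$

where $v(t) := (l - F)'s(t)$. Consequently, if

$$\chi(d;F) = A(d)R(d;F) + d^\ell B(d)S(d;F)$$

we find

$$y(k) = \frac{d^\ell B(d)}{\chi(d;F)}v(t)\delta_{t,k} + \frac{C(d)R(d;F)}{\chi(d;F)}e(k)$$

$$u(k) = \frac{A(d)}{\chi(d;F)}v(t)\delta_{t,k} - \frac{C(d)S(d;F)}{\chi(d;F)}e(k)$$

Thus

$$\frac{\partial C_T(F,l)}{\partial l}\Big|_{l=F} = T^{-1}\sum_{i=1}^{T}\mathcal{E}\left\{\left[y(t+i)\frac{\partial y(t+i)}{\partial v(t)}s(t) + \rho u(t+i-1)\frac{\partial u(t+i-1)}{\partial v(t)}s(t)\right]_{l=F}\right\}$$

Since

$$\frac{\partial y(t+i)}{\partial v(t)} = \left[\frac{d^\ell B(d)}{\chi(d;F)}\right]_i, \qquad \frac{\partial u(t+i)}{\partial v(t)} = \left[\frac{A(d)}{\chi(d;F)}\right]_i$$

taking into account (50), (53) follows.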

Taking into account (36) together with Lemma 5, we get the following result. 

Proposition 9.5-2. The ODE associated with the MUSMAR feedback-gain vector is given by

$$\dot F(\tau) = -[\Xi(\tau)]^{-1}\mathcal{R}_s^{-1}(\tau)T\nabla_lC_T(F(\tau),l) + o(\|\tilde F(\tau)\|)$$ (9.5-54)

Remark 9.5-3 Since $\mathcal{R}_s(\tau) > 0$ and $\Xi(\tau) > 0$ for $\rho > 0$, (54) for $\rho > 0$ implies that the equilibrium points $F$ of the MUSMAR algorithm are the extrema of the cost $C_T$ in a receding-horizon sense, viz. $C_T(F, u(t) = F's(t))$ is an extremum of $C_T(F, u(t) = l's(t))$, where $s(t)$ denotes the pseudostate at time $t$ corresponding in stochastic steady-state to the feedback $F$. Such a conclusion holds true irrespective of the plant I/O delay $\ell$ and the regulator complexity $\hat n$. □

In order to establish which equilibrium points are possible convergence points of the MUSMAR algorithm, let us consider the cost in stochastic steady-state

$$C(F) = \frac{1}{2}\mathcal{E}\{y^2(t) + \rho u^2(t) \mid u(k) = F's(k)\}$$ (9.5-55)

as a function of the stabilizing constant feedback $F$. As shown in Problem 6

$$\nabla_FC(F) := \frac{\partial C(F)}{\partial F} = \mathcal{E}\left\{y(t)\left[\frac{d^\ell B(d)}{\chi(d;F)}s(t)\right] + \rho u(t)\left[\frac{A(d)}{\chi(d;F)}s(t)\right]\right\}$$ (9.5-56)

Problem 9.5-6 Verify that (56) gives the gradient of the stochastic steady-state cost (55) with 
respect to a stabilizing constant feedback F. 

Thus, comparing (50) with (56) and taking into account the dependence of $y(t)$, $u(t)$ and $s(t)$ on $e$, (50) is seen to be a good approximation to (56) whenever

$$|\lambda[\chi(d;F)]|^{2(T-\ell)} \gg 1$$ (9.5-57)

where $\lambda[\chi]$ denotes any root of $\chi$. Therefore, in a neighbourhood of any equilibrium point satisfying (57), the ODE (54) can be approximated by

$$\dot F(\tau) = -[\Xi(\tau)]^{-1}\mathcal{R}_s^{-1}(\tau)T\nabla_FC(F(\tau)) + o(\|\tilde F(\tau)\|)$$ (9.5-58)

The above results are summarized in the following theorem.



Theorem 9.5-1. Consider the MUSMAR algorithm for any I/O delay $\ell$, an arbitrary strictly Hurwitz $C(d)$ innovations polynomial, and any pseudostate complexity $\hat n$. Then:

i. For any $T \geq \ell$, MUSMAR equilibrium points are the extrema $F^*$ of the receding-horizon variant (51) of the quadratic cost;

ii. Amongst the equilibria $F^*$ giving rise to a closed-loop system with well-damped modes relative to the regulation horizon $T$ such that (50) can be replaced by (56), the only possible MUSMAR converging points for any $\rho > 0$ approach the local minima ($\nabla_F^2C(F^*) > 0$) or ridge points ($\nabla_F^2C(F^*) \geq 0$) of the cost (55).

Proof Part (i) is proved in Remark 3. Part (ii) is proved by Result 1 according to which the only possible convergence points of a recursive stochastic algorithm are the locally stable equilibrium points of the associated ODE. Since in (58), for $\rho > 0$, $\Xi(\tau) > 0$ and $\mathcal{R}_s(\tau) > 0$, the conclusion follows.
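Condition (57) lends itself to a simple numerical check once a candidate closed-loop polynomial is available. The Python sketch below is an illustration only: it assumes $\chi(d)$ is stored in ascending powers of $d$, and the function name `horizon_margin` and the example polynomial $\chi(d) = 1 - 0.5d$ are hypothetical. How large the returned value must be to count as "much larger than one" is left to the user.

```python
import numpy as np

def horizon_margin(chi, T, ell):
    """Smallest |lambda|^(2(T - ell)) over the roots lambda of chi(d).

    chi holds the coefficients of chi(d) in ascending powers of d;
    np.roots expects descending powers, hence the reversal.
    """
    chi = np.asarray(chi, dtype=float)
    roots = np.roots(chi[::-1])
    return float(np.min(np.abs(roots)) ** (2 * (T - ell)))

# chi(d) = 1 - 0.5 d has its single root at d = 2.
margin = horizon_margin([1.0, -0.5], T=3, ell=1)
```

A root of $\chi$ close to the unit circle makes the margin grow slowly with $T$, which is the situation in which (50) and (56) may differ appreciably.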

Remark 9.5-4 The relevance of Theorem 1 is two-fold. First, since no assumption was made on the I/O delay or the pseudostate complexity $\hat n$, it turns out that, if $T$ is large enough, the only possible convergence points of MUSMAR tightly approach the local minima of the criterion even in the presence of unmodelled plant dynamics. Moreover, this holds true irrespective of any positive-real condition, [MZ84], [MZ87], though MUSMAR is based on RLS (Cf. Proposition 1). □

It is difficult to characterize the closed-loop behaviour of the plant for $\hat n < n$. Conversely, if the feedback control law has enough parameters, then, according to [Tru85], the minima of $C(F)$ are related to steady-state LQS regulation. More precisely:

Result 9.5-2. If, in addition to the assumptions in (ii) of Theorem 1, $\hat n \geq n$ and the dither intensity is negligible w.r.t. that of the innovation process, $C(F)$ has a unique minimum coinciding with the steady-state LQS feedback.

Therefore, from Theorem 1 and Result 2, it follows that, if $T$ is large enough in the sense of (57), $\sigma_\eta^2 \ll \sigma_e^2$, and $\hat n \geq n$, MUSMAR has a unique possible convergence point that tightly approximates the steady-state LQS feedback.

9.5.3 Simulation Results 

The results of the above ODE analysis are important in that they show that if the algorithm
converges, then under general assumptions, it converges to desirable points. Thus, 
the analysis allows us to disprove the existence of possible undesirable convergence 
points. However, ODE analysis leaves unanswered fundamental queries on the 
algorithm. Among them, it is of paramount interest to establish if the algorithm 
converges under realistic conditions. Since in this respect any further analysis 
appears to be prevented, we are forced to resort to simulation experiments. 

In all the examples the estimates are obtained by a factorized U-D version of 
RLS with no forgetting (Cf. Sect. 6.3); the innovations and dither variance are, 
respectively, 1 and 0.0001; and simulation runs involve 3000 steps. 



Example 9.5-2 Consider the plant

$$y(t+1) + 0.9y(t) + \epsilon y(t-1) = u(t) + e(t+1) - 0.7e(t)$$

with $\epsilon = -0.5$ and $\rho = 1$. This is a second-order plant. However, it is regulated under the assumption that it is of first order, the term in $\epsilon$ being considered as a perturbation. Hence the controller has the structure

$$u(t) = [\,f_1\ \ f_2\,]\begin{bmatrix}y(t) \\ u(t-1)\end{bmatrix}$$

Fig. 1 (a)-(c) shows the evolution, in the feedback parameter space, of the feedback vector $F(t)$ for $T = 1$, 2 and 3, superimposed on the level curves of the unconditional quadratic cost, $\mathcal{E}\{y^2(t) + u^2(t)\}$, constrained to the chosen regulator structure. For $T = 1$, convergence occurs to a point far from the optimum (where the cost is twice the minimum). For $T = 2$, MUSMAR is already quite close to the optimum and, for $T = 3$, the result is even better.
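As a side check on the plant of Example 2, one may compute the closed-loop characteristic roots for a fixed pair of gains. The Python sketch below rests on assumptions that should be read with care: the regulator is taken as $u(t) = f_1y(t) + f_2u(t-1)$, so that $R(d) = 1 - f_2d$ and $S(d) = -f_1$; the dither is ignored; and the function names and the trial gains are hypothetical. Stability in the $d$-domain means that all roots of $\chi(d)$ lie outside the unit circle.

```python
import numpy as np

def closed_loop_roots(f1, f2, eps=-0.5):
    """Roots in d of chi(d) = A(d)R(d) + d B(d)S(d) for the plant of Example 2."""
    A = np.array([1.0, 0.9, eps])      # A(d) = 1 + 0.9 d + eps d^2
    R = np.array([1.0, -f2])           # R(d) = 1 - f2 d
    chi = np.convolve(A, R)            # ascending powers of d
    chi[1] -= f1                       # d * B(d) * S(d) with B = 1, S = -f1
    return np.roots(chi[::-1])         # np.roots wants descending powers

def is_stabilizing(f1, f2, eps=-0.5):
    """True if every root of chi(d) lies strictly outside the unit circle."""
    return bool(np.all(np.abs(closed_loop_roots(f1, f2, eps)) > 1.0))

open_loop_stable = is_stabilizing(0.0, 0.0)   # plant alone
trial_stable = is_stabilizing(0.9, 0.0)       # hypothetical trial gains
```

Under these assumptions the plant with $\epsilon = -0.5$ is open-loop unstable, so any convergence point of interest in the $(f_1, f_2)$ plane must lie inside the stabilizing region.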



Example 9.5-3 Consider the sixth-order plant

$$y(t+6) - 3.102y(t+5) + 4.049y(t+4) - 2.974y(t+3) + 1.356y(t+2) - 0.37y(t+1) + 0.0461y(t) =$$
$$= 0.01u(t+5) + 0.983u(t+4) - 1.646u(t+3) + 1.1788u(t+2) - 0.3343u(t+1) + 0.0353u(t) + e(t+6)$$

with the proportional regulator

$$u(t) = fy(t)$$

and $\rho = 0$. In this example, the unconditional cost $C(F)$, as shown in Fig. 2, exhibits a finite maximum between two minima. When $f$ is held constant at $-0.5$ for the first 100 steps, the feedback-gain converges to the indicated squares according to various choices of the control horizon denoted by $T$. When $f$ is held constant at $-0.65$, it converges for $T = 6$ to a value close to the other minimum. No convergence to the local maximum is observed, even when the initial feedback is close to it.

Example 9.5-4 Consider the plant

$$y(t+4) - 0.167y(t+3) - 0.74y(t+2) - 0.132y(t+1) + 0.87y(t) =$$
$$= 0.132u(t+3) + 0.545u(t+2) + 1.117u(t+1) + 0.262u(t) + e(t+4)$$

This model corresponds to the flexible robot arm described in [Lan85]. It is a nonminimum-phase plant with a high-frequency resonance peak. With a reduced complexity two-term regulator

$$u(t) = [\,f_1\ \ f_2\,]\begin{bmatrix}y(t) \\ y(t-1)\end{bmatrix}$$

and $\rho = 10^{-4}$, the unconditional performance index exhibits a narrow "valley" from $[\,0\ \ 0\,]$ to the minimum at $[\,-0.787\ \ 0.86\,]$. Fig. 3 shows that MUSMAR, for $T = 5$, converges slowly but steadily to the point $[\,-0.677\ \ 0.753\,]$, close to the minimum (a loss of 1.28, against 1.26 at the optimum).

For both plants of Examples 3 and 4, the use of $T = \ell = 1$ yields unstable closed-loop systems.

Figure 9.5-3: Time behaviour of MUSMAR feedback parameters of Example 4 for T = 5.

Main points of the section Under stability conditions, the asymptotic behaviour of many stochastic recursive algorithms, such as the ones of recursive estimation and adaptive control, can be described in terms of a set of ordinary differential equations. The method of analysis based on this result, called the ODE method, though not capable of yielding global convergence results, is by all means valuable in that it allows us to uncover necessary conditions for convergence. Although the feedback-dependent parameterization of the implicit closed-loop plant model makes MUSMAR global convergence analysis a formidable problem, ODE analysis enables us to establish local convergence properties of the algorithm. These results reveal that, even in the presence of any structural mismatching condition, MUSMAR equilibrium points coincide with the extrema of the cost in a receding-horizon sense. Further, as the length of the prediction horizon increases, MUSMAR possible convergence points approach the minima of the adopted quadratic criterion.

9.6 Extensions of the MUSMAR Algorithm 

9.6.1 MUSMAR with Mean-Square Input Constraint 

In all control applications, the actuator power is limited. It is therefore important to explicitly take into account such a restriction in the controller design specifications. This can be done by adopting either a hard-limit input constraint or a mean-square (MS) input constraint approach. These are two possible alternatives, the more convenient of which depends on the application at hand. A hard-limit input constraint leads to a difficult nonlinear optimization problem. In this connection, approximate solutions are proposed in [TC88], [Toi83a], [Böh85]. In [TC88] an adaptive GPC yielding an approximate solution to a Quadratic Programming problem is considered. In [Toi83a] an approximation to the probability density function of the plant input is used. In [Böh85] spread-in-time Riccati iterations are considered. For the alternative approach, [Toi83b] considered an input MS-constraint. Specifically, an algorithm was proposed by combining the GMV self-tuning regulator with a stochastic approximation scheme. Though appealing for its simplicity, this algorithm has drawbacks (with nonminimum-phase or unstable plants, and/or time-varying I/O delay), inherited from the one-step-ahead cost.

In this section we study an MS input constrained adaptive control algorithm whose underlying control law is capable of overcoming the above drawbacks. The algorithm is obtained by conveniently combining the MUSMAR algorithm discussed in the previous sections with the stochastic approximation scheme of [Toi83b]. Hereafter, this algorithm is referred to as CMUSMAR (Constrained MUSMAR). The main interest is in its convergence properties. Local convergence results are obtained. The strongest of them asserts that the isolated constrained minima of the underlying steady-state quadratic cost are possible convergence points of CMUSMAR. This conclusion also holds true in the presence of plant unmodelled dynamics and unknown I/O transport delay. The study is carried out by using the ODE convergence analysis of Sect. 5 and singular perturbation theory of ODEs [Was65], [KKO86]. The actual convergence of CMUSMAR to the possible equilibrium gains predicted by the theory is explored by means of simulation examples.

Formulation of the problem Consider the CARMA plant 

A(d)y(t) = B(d)u(t) + C(d)e(t) (9.6-1) 

with all the properties in (2-1), and 

$$n = \max\{\partial A(d), \partial B(d), \partial C(d)\}.$$

Let also e be a sequence of zero-mean, independent, identically distributed random 
variables such that all moments exist. A linear control regulation law 

R(d)u(t) = -S(d)y(t) (9.6-2) 

is considered for the plant (1). In (2) R(d) and S(d) are polynomials, with R(d) 
monic. Eq. (2) can be equivalently rewritten as 

u(t) = F's(t) (9.6-3) 

where F is a vector whose entries are the coefficients of R(d) and S(d), and s(t) 
the pseudostate (Cf. Remark 2-1), given by 

$$s(t) = [\,(y^{t}_{t-n})'\ \ (u^{t-1}_{t-m})'\,]'$$ (9.6-4)
The following problem is considered. 

Problem 1 Given $c^2 > 0$, find in (3) an $F$ solving

$$\min_F \lim_{t \to \infty}\mathcal{E}\{y^2(t)\}$$ (9.6-5)

subject to the constraint

$$\lim_{t \to \infty}\mathcal{E}\{u^2(t)\} \leq c^2$$ (9.6-6)

According to the Kuhn-Tucker theorem [Lue69], Problem 1 is converted to the following unconstrained minimization problem.

Problem 2 Given $c^2 > 0$, find in (3) an $F$ solving

$$\min_F \mathcal{L}(F,\rho)$$ (9.6-7)

where $\mathcal{L}$ is the Lagrangian function given by the unconditional cost

$$\mathcal{L}(F,\rho) := \lim_{t \to \infty}\mathcal{E}\{y^2(t) + \rho u^2(t)\}$$ (9.6-8)

and the Lagrange multiplier $\rho$ satisfies the Kuhn-Tucker complementary condition

$$\rho\left(\lim_{t \to \infty}\mathcal{E}\{u^2(t)\} - c^2\right) = 0$$ (9.6-9)

For an unknown plant, Problem 1, or equivalently Problem 2, is to be solved by an 
adaptive control algorithm capable of selecting p and approximating an F which 
minimizes (8) under the constraint (9). 
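The complementary condition (9) distinguishes two situations: either the multiplier vanishes and the input MS value lies inside the feasibility region, or the multiplier is positive and the constraint is active. The small Python sketch below merely encodes this check for given steady-state numbers. It assumes the standard Kuhn-Tucker requirement $\rho \geq 0$; the function name `kuhn_tucker_ok`, the tolerance and the test values are hypothetical.

```python
def kuhn_tucker_ok(rho, input_ms, c2, tol=1e-9):
    """Check feasibility, rho >= 0 and complementarity rho * (E{u^2} - c^2) = 0."""
    feasible = input_ms <= c2 + tol
    nonnegative = rho >= -tol
    complementary = abs(rho * (input_ms - c2)) <= tol
    return feasible and nonnegative and complementary

inactive = kuhn_tucker_ok(0.0, 0.05, 0.1)    # constraint not active, rho = 0
active = kuhn_tucker_ok(1.0, 0.1, 0.1)       # constraint active, rho > 0
violated = kuhn_tucker_ok(1.0, 0.05, 0.1)    # rho > 0 but constraint slack
```

The first two cases correspond to the two families of equilibria that the adaptive algorithm is expected to select between.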

Remark 9.6-1 Let $r$ be the output set point and $\tilde y(t) := y(t) - r$ the tracking error whose MS value $\mathcal{E}\{\tilde y^2(t)\}$ has to be minimized in stochastic steady-state under the constraint (6). This problem can be cast into the above formulation by changing $y(t)$ into $\tilde y(t)$ and using the enlarged pseudostate

$$s_r(t) := [\,s'(t)\ \ r\,]'$$

and $u(t) = F's_r(t)$ instead of (3). In case the circumstances are such that $\mathcal{E}\{\delta u^2(t)\} \leq c^2$, $\delta u(t) := u(t) - u(t-1)$, is more suitable than (6), one can use the pseudostate

$$s_\delta(t) := [\,y(t)\ \cdots\ y(t-n)\ \ \delta u(t-1)\ \cdots\ \delta u(t-m)\,]',$$

and the control variable $\delta u(t) = F's_\delta(t)$ at the input of a CARIMA plant

$$A(d)\Delta(d)y(t) = B(d)\delta u(t) + C(d)e(t),$$ (9.6-10)

$\Delta(d) := 1 - d$. This is an integral action variant of (1)-(6) with $y(t)$ changed into $\tilde y(t)$ and $s(t)$ into $s_\delta(t)$, capable of ensuring in stochastic steady-state rejection of constant disturbances. □



MS Input Constrained MUSMAR As a candidate algorithm for solving the problem stated above, the stochastic approximation approach proposed in [Toi83b], combined with the MUSMAR algorithm, is considered.

At each sampling time $t$, the MUSMAR algorithm selects, via the Enforced Certainty Equivalence procedure, $u(t)$ so as to minimize in stochastic steady-state and in a receding-horizon sense the multistep quadratic cost (5-51). Next, the Lagrange multiplier $\rho = \rho(t)$ is updated via the following recurrent scheme:

$$\rho(t) = \rho(t-1) + \varepsilon\gamma(t)\rho(t-1)[u^2(t-1) - c^2]$$ (9.6-11)

in which $\varepsilon$ is a positive real and $\{\gamma(t)\}$ a sequence of real numbers, whose selection will be made clear in the sequel. CMUSMAR is, then, obtained as detailed below.

CMUSMAR algorithm At each step $t$, recursively execute the following steps:

i. Update RLS estimates of the closed-loop system parameters $\theta_i$, $\mu_i$, $\Gamma_i$ and $\Lambda_i$, $i = 1, \cdots, T$:

$$\begin{bmatrix}\theta_i(t) \\ \Gamma_i(t)\end{bmatrix} = \begin{bmatrix}\theta_i(t-1) \\ \Gamma_i(t-1)\end{bmatrix} + K(t-T)[y(t-T+i) - \theta_i(t-1)u(t-T) - \Gamma_i'(t-1)s(t-T)]$$ (9.6-12)

and, for $i = 2, \cdots, T$

$$\begin{bmatrix}\mu_i(t) \\ \Lambda_i(t)\end{bmatrix} = \begin{bmatrix}\mu_i(t-1) \\ \Lambda_i(t-1)\end{bmatrix} + K(t-T)[u(t-T+i-1) - \mu_i(t-1)u(t-T) - \Lambda_i'(t-1)s(t-T)]$$ (9.6-13)

In (12) and (13) $K(t-T) = P(t-T+1)\varphi(t-T)$ denotes the updating gain associated with the regressors

$$\varphi(j) := [\,u(j)\ \ s'(j)\,]'$$ (9.6-14)

and $\mu_1(t) = 1$ and $\Lambda_1(t) = 0$.

ii. Update the control cost weight $\rho(t)$ by using (11) with

$$\gamma(t) = [K'(t-T)K(t-T)]^{1/2}$$ (9.6-15)

iii. Update the vector of feedback gains $F$ by

$$\Xi(t) = \sum_{i=1}^{T}\theta_i^2(t) + \rho(t)\left[1 + \sum_{i=2}^{T}\mu_i^2(t)\right]$$ (9.6-16)

$$F(t) = -\frac{1}{\Xi(t)}\left[\sum_{i=1}^{T}\theta_i(t)\Gamma_i(t) + \rho(t)\sum_{i=2}^{T}\mu_i(t)\Lambda_i(t)\right]$$ (9.6-17)

iv. Apply to the plant an input given by

$$u(t) = F'(t)s(t) + \eta(t)$$ (9.6-18)

where $\eta$ is a zero-mean independent identically distributed dither noise independent of $e$ and such that all moments exist.

The dither is introduced so as to guarantee persistency of excitation (Cf. Problem 5-5). For $T = 1$, the above algorithm reduces to the constrained MV self-tuner given in [Toi83b].
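The steps above can be sketched in Python as three small routines. This is a minimal illustration under stated assumptions, not the implementation used for the simulations: `rls_step` is a plain (non-factorized) RLS update without forgetting, whereas the examples in the text use a U-D factorized version; the predictors are stored with $\mu_1 = 1$ and $\Lambda_1 = 0$ so that the sums in (16) and (17) can run over all $i$; all function names and numerical values are hypothetical.

```python
import numpy as np

def rls_step(params, P, phi, target):
    """One plain RLS update (no forgetting); returns (params, P, K)."""
    phi = np.asarray(phi, dtype=float)
    P = P - np.outer(P @ phi, phi @ P) / (1.0 + phi @ P @ phi)
    K = P @ phi                                  # updating gain
    params = params + K * (target - phi @ params)
    return params, P, K

def update_rho(rho, u_prev, c2, eps, gamma):
    """Multiplier recursion (9.6-11)."""
    return rho + eps * gamma * rho * (u_prev ** 2 - c2)

def feedback_gain(theta, Gamma, mu, Lam, rho):
    """Feedback gains (9.6-16)-(9.6-17), with mu[0] = 1 and Lam[0] = 0 stored."""
    theta = np.asarray(theta, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Gamma = np.asarray(Gamma, dtype=float)
    Lam = np.asarray(Lam, dtype=float)
    Xi = np.sum(theta ** 2) + rho * np.sum(mu ** 2)
    return -(theta @ Gamma + rho * (mu @ Lam)) / Xi

# Hypothetical numbers, T = 1 and a two-dimensional pseudostate.
params, P, K = rls_step(np.zeros(2), np.eye(2), [1.0, 0.0], 2.0)
gamma = float(np.sqrt(K @ K))                    # as in (9.6-15)
rho_new = update_rho(1.0, 2.0, 1.0, 0.1, gamma)
F = feedback_gain([1.0], [[0.5, -0.2]], [1.0], [[0.0, 0.0]], 1.0)
```

Note that the recursion for $\rho$ is multiplicative in $\rho(t-1)$, so a multiplier initialized at a positive value stays positive as long as the step $\varepsilon\gamma(t)[u^2(t-1) - c^2]$ remains larger than $-1$.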

ODE Convergence Analysis The algorithm introduced above is now analysed using the ODE method. We can associate to CMUSMAR the following set of ODEs as in (5-33) ($i = 1, \cdots, T$ and $j = 2, \cdots, T$):

$$\begin{bmatrix}\dot\theta_i(\tau) \\ \dot\Gamma_i(\tau)\end{bmatrix} = \mathcal{R}^{-1}(\tau)\,\mathcal{E}\{\varphi(t)[y(t+i) - \theta_i(\tau)u(t) - \Gamma_i'(\tau)s(t)] \mid F(\tau)\}$$ (9.6-19)

$$\begin{bmatrix}\dot\mu_j(\tau) \\ \dot\Lambda_j(\tau)\end{bmatrix} = \mathcal{R}^{-1}(\tau)\,\mathcal{E}\{\varphi(t)[u(t+j-1) - \mu_j(\tau)u(t) - \Lambda_j'(\tau)s(t)] \mid F(\tau)\}$$ (9.6-20)

$$\dot{\mathcal{R}}(\tau) = -\mathcal{R}(\tau) + \mathcal{R}_\varphi(\tau)$$ (9.6-21a)

$$\mathcal{R}_\varphi(\tau) = \mathcal{E}\{\varphi(t)\varphi'(t) \mid F(\tau)\}$$ (9.6-21b)

$$\dot\rho(\tau) = \varepsilon\rho(\tau)[\mathcal{E}\{u^2(t)\} - c^2],$$ (9.6-22)

where $\varphi(t)$ is as in (14). In (19) and (20), a dot denotes derivative with respect to $\tau$, and $\mathcal{E}\{\cdot\}$ the expectation w.r.t. the probability density function induced on $u(t)$ and $y(t)$ by $e$ and $\eta$, assuming that the system is in stochastic steady-state corresponding to the constant control law

$$u(t) = F'(\tau)s(t) + \eta(t)$$ (9.6-23)

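and a constant $\rho(\tau)$. Hereafter, in order to simplify the notation, the variable $\tau$, as well as the conditioning upon $F(\tau)$, will be omitted. In order to obtain a differential equation for $F$, differentiate (17) with respect to $\tau$,

$$\dot F = \dot{\bar F} + \frac{\partial F}{\partial\rho}\dot\rho$$ (9.6-24)

where $\dot{\bar F}$ denotes the derivative of $F$ assuming $\rho$ constant. In Proposition 5-2 it has been shown that the following ODE holds

$$\dot{\bar F} = -\Xi^{-1}\mathcal{R}_s^{-1}T\nabla_T\mathcal{L} + o(\|\tilde F\|),$$ (9.6-25)

where $\tilde F := F - F^*$; $F^*$ denotes any equilibrium point; $o(\|x\|)$ is such that $\lim_{x \to 0}[o(\|x\|)/\|x\|] = 0$; $\mathcal{R}_s := \mathcal{E}\{s(t)s'(t) \mid F(\tau)\}$. Finally, $\nabla_T\mathcal{L}$ is an approximation to the gradient of $\mathcal{L}$ w.r.t. $F$ which becomes increasingly tighter as $T \to \infty$. Thus, the ODE for $F$ associated with CMUSMAR is

$$\dot F = -\Xi^{-1}\mathcal{R}_s^{-1}T\nabla_T\mathcal{L} + \frac{\partial F}{\partial\rho}\dot\rho + o(\|\tilde F\|)$$ (9.6-26)

and

$$\dot\rho = \varepsilon\rho[\mathcal{E}\{u^2(t)\} - c^2]$$ (9.6-27)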

If $F$ converges to a stabilizing control law, so that $\mathcal{R}_s$ converges to a strictly positive definite bounded matrix, and $\rho$ converges to a positive number, then, as pointed out in the previous section, the parameter estimates $\theta(t) := \{\theta_i(t), \Gamma_i(t), \mu_i(t), \Lambda_i(t)\}$ also converge. Therefore, the only possible convergence points of CMUSMAR are given by the stable equilibrium points of (26) and (27). These are given by:

(A) $\nabla_T\mathcal{L} = 0$, $\rho = 0$

(B) $\nabla_T\mathcal{L} = 0$, $\mathcal{E}\{u^2(t)\} = c^2$

(A)-equilibria correspond to the extrema of the receding-horizon variant of the MS output cost for which the corresponding MS input is less than $c^2$. (B)-equilibria correspond to the extrema of the receding-horizon variant of the MS output cost on the boundary of the feasibility region $\mathcal{E}\{u^2(t)\} \leq c^2$.

Stable (A)-equilibria We have the following result:

Proposition 9.6-1. Consider the CMUSMAR algorithm with any controller complexity and any plant I/O delay smaller than $T$. Then, if $T$ is large enough in the sense of (ii) of Theorem 5-1, among the (A)-equilibria, the only possible convergence points of CMUSMAR are the minima or ridge points of the MS output value in the feasibility region $\mathcal{E}\{u^2(t)\} < c^2$.

Proof Eq. (26) and (27) are of the form

$$\dot F = G(F,\rho)$$ (9.6-28)

$$\dot\rho = \varepsilon H(F,\rho)$$ (9.6-29)

In order to find the possibly locally stable equilibria of (28) and (29) the following Jacobian matrix is considered

$$\begin{bmatrix}\dfrac{\partial G}{\partial F} & \dfrac{\partial G}{\partial\rho} \\ \varepsilon\dfrac{\partial H}{\partial F} & \varepsilon\dfrac{\partial H}{\partial\rho}\end{bmatrix}$$ (9.6-30)

The entries of the Jacobian matrix at the (A)-equilibria are given by

$$\begin{bmatrix}-\Xi^{-1}\mathcal{R}_s^{-1}T\nabla_T^2\mathcal{L} & \dfrac{\partial G}{\partial\rho} \\ 0 & \varepsilon[\mathcal{E}\{u^2(t)\} - c^2]\end{bmatrix}$$

This, being upper block triangular with $\Xi > 0$, $\varepsilon > 0$ and $\mathcal{R}_s > 0$, corresponds to possibly locally stable equilibria, whenever $\nabla_T^2\mathcal{L} \geq 0$ and $\mathcal{E}\{u^2(t)\} < c^2$.



Stable (B)-equilibria Stability analysis of (B)-equilibria appears to be a difficult task since, in this case, the Jacobian matrix (30) need not be block diagonal. In such a case, we consider (28) and (29) for small positive reals $\varepsilon$. Then, (28) and (29) can be regarded as a singularly perturbed differential system [Was65], [KKO86] of which (28) and (29) describe the "fast" and, respectively, the "slow" states.

Hereafter, the interest is directed to the (B)-equilibria at which $\nabla_T^2\mathcal{L} > 0$, viz. (B)-equilibria which are isolated minima of the cost (8) for a fixed $\rho_0$. Any such (B)-equilibrium point will be denoted by $\beta = [\,F_0'\ \ \rho_0\,]'$.

Since the plant to be regulated is linear and time invariant, and, at every $\beta$-point, the closed loop system is asymptotically stable, the next property holds.

Property 9.6-1 The functions $G$ and $H$ in (28) and, respectively, (29) are continuously differentiable in a neighbourhood of $\beta$. □

Since, for every $\beta$

$$\frac{\partial G}{\partial F}\Big|_\beta \propto -\nabla_T^2\mathcal{L}\Big|_\beta + O(\varepsilon)$$

where $\lim_{\varepsilon \to 0}O(\varepsilon) = 0$, for $\varepsilon$ small enough,

$$\frac{\partial G}{\partial F}\Big|_\beta < 0$$ (9.6-31)

Then the Implicit Function Theorem [Zei85] assures that the following property is satisfied:

Property 9.6-2 In a neighbourhood of $\rho_0$, the equation $G(F,\rho) = 0$ has an isolated solution $F = F(\rho)$ with $F(\cdot)$ continuously differentiable. □

Setting $\bar\tau := \varepsilon\tau$, (28) and (29) become:

$$\varepsilon\frac{dF}{d\bar\tau} = G(F,\rho) \quad \text{and} \quad \frac{d\rho}{d\bar\tau} = H(F,\rho)$$ (9.6-32)

Property 9.6-3 Consider the "reduced system"

$$\frac{d\rho}{d\bar\tau} = H(F(\rho),\rho) = \rho[\bar u^2(\rho) - c^2]$$ (9.6-33)

Then $\rho_0$, such that $\bar u^2(\rho_0) := \mathcal{E}\{u^2(t)\} = c^2$, is an isolated equilibrium point at which (33) is exponentially stable. □

In order to prove Property 3, it will be shown by the next Lemma 1 and Property 1 that the following input MS monotonicity property holds

$$\frac{dH}{d\rho}\Big|_{\rho_0} = \rho_0\frac{d\bar u^2(\rho)}{d\rho}\Big|_{\rho_0} < 0$$

Lemma 9.6-1. Let $\bar u^2(\rho)$ be the input MS value $\bar u^2(\rho) := \mathcal{E}\{u^2(t)\}$ corresponding to an isolated minimum of the stochastic steady-state quadratic cost $\mathcal{L}(F,\rho)$ for a given $\rho$. Then, $\bar u^2(\rho)$ is a strictly decreasing function of $\rho$ in a neighbourhood of $\rho_0$, $\rho_0$ being specified as in Property 2.

Proof Let $\rho_1$ and $\rho_2$, $\rho_2 > \rho_1$, be in a suitably small neighbourhood of $\rho_0$. Let, according to Property 2, $F_i = F(\rho_i) = \arg\min_F\mathcal{L}(F,\rho_i)$, $i = 1, 2$. Further let $\bar u_i^2$, $\bar y_i^2$ denote the corresponding stochastic steady-state MS values of the input and the output, respectively. Then, proceeding as in the proof of Theorem 7.4-1, we get that $\rho_2 > \rho_1$ implies $\bar u_2^2 - \bar u_1^2 < 0$.
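The role of the monotonicity property in Property 3 can be visualized by integrating the reduced system (33) numerically. The Python sketch below is purely illustrative: the input MS curve $\bar u^2(\rho) = 1/(1+\rho)$ is a hypothetical strictly decreasing function chosen for convenience (it is not derived from any plant in the text), the constraint level is $c^2 = 0.5$, and forward-Euler integration with a fixed step is assumed to be accurate enough for this purpose.

```python
def simulate_multiplier(u2_of_rho, c2, rho_init, dt=0.01, steps=20000):
    """Forward-Euler integration of d rho / d tau = rho * (u2(rho) - c2)."""
    rho = rho_init
    for _ in range(steps):
        rho += dt * rho * (u2_of_rho(rho) - c2)
    return rho

def u2(rho):
    """Hypothetical strictly decreasing input MS value (not from the text)."""
    return 1.0 / (1.0 + rho)

# With c2 = 0.5 the constraint is active at rho_0 = 1, since u2(1) = 0.5.
rho_from_below = simulate_multiplier(u2, c2=0.5, rho_init=1e-4)
rho_from_above = simulate_multiplier(u2, c2=0.5, rho_init=5.0)
```

Starting either below or above $\rho_0$, the multiplier is driven towards the value at which the constraint is met with equality; a non-monotone $\bar u^2(\rho)$ could instead produce several equilibria.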



Property 9.6-4 Consider, for fixed $\rho$, the "boundary layer system"

$$\frac{dF}{d\tau} = G(F,\rho)$$ (9.6-34)

Then, (34) is exponentially stable at $F = F(\rho)$ uniformly in $\rho$ in a suitable neighbourhood of $\rho_0$. □

Property 4 is fulfilled by virtue of (31). In fact, (31) implies, by Theorem 9.3 of [BN66], exponential stability at $F = F(\rho)$. Next, (31), together with Property 2, implies that there exists a suitably small neighbourhood of $\rho_0$ where the exponential stability referred to above is uniform in $\rho$.

Taking into account Properties 1-4, stability theory of singularly perturbed ODEs (Cf. Corollary 7.2.3 of [KKO86]) yields the following conclusion.

Theorem 9.6-1. Let the control horizon $T$ of CMUSMAR be large enough w.r.t. the time constants of the closed loop system, and $s(t)$ chosen so as to yield isolated minima of the cost. Then, there exists an $\bar\varepsilon > 0$ such that, for every positive $\varepsilon < \bar\varepsilon$, any feedback-gain solving Problem 1 (viz. minimizing the MS output value inside or along the boundary of the feasibility region $\mathcal{E}\{u^2(t)\} \leq c^2$) is a possible convergence point of CMUSMAR.



Simulation results Proposition 1 and Theorem 1 suggest that CMUSMAR may possess nice convergence properties. However, there is no guarantee that CMUSMAR will actually be capable of converging to the desired points. In order to explore this point, we resort to simulation experiments. In all the experiments $e$ is a zero-mean, Gaussian, stationary sequence with $\mathcal{E}\{e^2\} = 1$.

Example 9.6-1 CMUSMAR convergence properties are studied when the constrained minimum is different from the unconstrained one. Consider the nonminimum-phase open-loop stable fourth order plant of Example 5-4 and the restricted complexity controller

$$u(t) = f_1y(t) + f_2y(t-1)$$

The Lagrange multiplier $\rho$ is initialized from a small value, viz. $\rho = 10^{-4}$, and $T = 5$ is the control horizon used. Since $\rho$ grows slowly, the feedback gains initially approach the unconstrained minimum of $\mathcal{E}\{y^2(t)\}$. As $\rho$ converges to its final value, the gains converge to a point close to the constrained minimum.

Fig. 1 shows the superposition of the feedback gains with the constant level curves of $\mathcal{E}\{y^2(t)\}$ and the boundary of the region defined by the restriction (6) with $c^2 = 0.1$.

Figure 9.6-1: Superposition of the feedback time-evolution over the constant level curves of $\mathcal{E}\{y^2(t)\}$ and the allowed boundary $\mathcal{E}\{u^2(t)\} = 0.1$ for CMUSMAR with T = 5 and the plant in Example 1.

Example 9.6-2 As noted before, to ensure that CMUSMAR has the constrained local minima of the steady-state LQS regulation cost as possible convergence points, the horizon $T$ must be large enough. In this example the plant of Example 7.4-1 is used, for which the control based on a single-step cost functional ($T = 1$) greatly differs from the steady-state LQS regulation. Consider the plant

$$y(t+3) - 2.75y(t+2) + 2.61y(t+1) + 0.855y(t) =$$
$$u(t+2) - 0.5u(t+1) + e(t+3) - 0.2e(t+2) + 0.5e(t+1) - 0.1e(t),$$

and the full complexity controller defined by

$$s(t) = [\,y(t)\ \ y(t-1)\ \ y(t-2)\ \ u(t-1)\ \ u(t-2)\ \ u(t-3)\,]'$$

As shown in Example 7.4-1, using $T = 1$ and taking $\rho$ as a parameter, this plant gives rise to a relationship between $\mathcal{E}\{u^2(t)\}$ and $\mathcal{E}\{y^2(t)\}$ which is not monotone (Fig. 7.4-1). Instead, for steady-state LQS regulation the relationship is monotone as guaranteed by Theorem 7.4-1. It can be seen from Fig. 7.4-1 that, in this example, the single-step-ahead constrained self-tuner has two possible equilibrium points denoted B and D in Fig. 7.4-1. Both of them correspond to the same value of the input variance but to quite different values of the output variance. These equilibria can be attained depending on the initial conditions.

This unpleasant phenomenon is eliminated by increasing the value of $T$ in CMUSMAR (Fig. 2). In fact, the dotted line in Fig. 7.4-1, which exhibits a monotonic behaviour and corresponds to steady-state LQS regulation, is already tightly approached for $T = 2$.

Figure 9.6-2: Time evolution of $\rho$ and $\mathcal{E}\{u^2(t)\}$ for CMUSMAR with T = 2 and the plant of Example 2.



9.6.2 Implicit Adaptive MKI: MUSMAR-∞

So far no extension of the celebrated weak self-tuning property of the RLS+MV adaptive regulator [AW73] was shown to exactly hold for steady-state LQS regulation. In this connection, however, the MUSMAR algorithm represents almost an exception, since it exhibits approximately the weak self-tuning property, the approximation becoming sharper as $T \to \infty$. Hereafter, we pose the following question:



Is it possible to adaptively get the semiinfinite steady-state LQS regula- 
tion for any CARMA plant by using a finite number of predictors whose 
parameters are estimated by standard RLS? 

An adaptive regulation algorithm solving this problem is considered. It embodies 
a standard RLS separate identification of the parameters of T > n + 1 predictors 
of the I/O joint process, n being the order of the CARMA plant, together with 
an appropriate control synthesis rule. The proposed algorithm turns out to be a 
modified version of MUSMAR performing spread in time MKI (Cf. Sect. 5.7). 

In Sect. 5 it was shown that the possible convergence points of MUSMAR are close
approximations to the minima of the adopted unconditional quadratic cost, even
under mismatching conditions, provided that the prediction horizon T is chosen
large enough. More precisely, T should be chosen such that

$$|\lambda_M|^{2(T-\tau)} \ll 1$$

where $\tau \ge 1$ is the plant I/O delay, and $\lambda_M$ the eigenvalue with maximum modulus
of the closed-loop system (Cf. (5-57)). This implies that when $|\lambda_M|$ is only slightly
less than one, T must be very large so as to let all the transients decay within the
prediction horizon. When $|\lambda_M|$ is a priori unknown, there is no definite criterion
for a suitable a priori choice of T. In practice, T is chosen as a compromise between
the computational complexity of the algorithm, which increases with T, and the
stabilization requirement for generic unknown plants. In fact, the latter would impose, in
principle, $T = \infty$, and hence an unacceptable computational load, as well as an
unrealizable implementation.
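
To get a feeling for how fast the required horizon grows as $|\lambda_M|$ approaches one, the following sketch computes the smallest T satisfying the condition above once "$\ll 1$" is replaced by a numerical threshold. The function name `min_horizon` and the threshold `tol` are illustrative assumptions, not part of the original development.

```python
import math


def min_horizon(lam_max: float, tau: int = 1, tol: float = 1e-2) -> int:
    """Smallest integer T such that lam_max**(2*(T - tau)) <= tol.

    lam_max : modulus of the dominant closed-loop eigenvalue (0 < lam_max < 1)
    tau     : plant I/O delay (tau >= 1)
    tol     : assumed numerical stand-in for "much smaller than one"
    """
    if not 0.0 < lam_max < 1.0:
        raise ValueError("lam_max must lie strictly between 0 and 1")
    if not 0.0 < tol < 1.0:
        raise ValueError("tol must lie strictly between 0 and 1")
    # lam^(2k) <= tol  <=>  k >= log(tol) / (2 log(lam)), both logs negative
    extra = math.ceil(math.log(tol) / (2.0 * math.log(lam_max)))
    return tau + extra


if __name__ == "__main__":
    for lam in (0.5, 0.9, 0.99):
        print(f"|lambda_M| = {lam}: T >= {min_horizon(lam)}")
```

With this (arbitrary) threshold the horizon passes from a handful of steps to several hundred as the dominant eigenvalue moves from 0.5 to 0.99, which is the difficulty referred to in the text.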

The above facts motivate the search for adaptive control algorithms which are based on a
finite number of identifiers and may yield a tight approximation to steady-state
LQS regulation. For the deterministic case, [SF81] and [OK87] proposed
schemes in which a state-space model of the plant is built upon estimates of the
one-step-ahead predictor. An estimate of the state provided by an adaptive observer
is then fed back, the feedback gain being computed via spread-in-time Riccati
iterations. Similar schemes have been developed for stochastic plants [Pet86].
When CARMA plants are considered, RELS or RML identification algorithms must
be used. This has the drawback that the inherent simplicity of standard RLS is
lost. Here, "simplicity" refers not only to the computational burden, but mainly to
the fact that both RELS and RML involve highly nonlinear operations in that their
regressor depends not only on the current experimental data, but also on previous
estimates. Along this line, it is interesting to establish whether the tuning properties
of the classical RLS+MV self-tuning regulator can be extended to RLS+LQS
adaptive regulation.

Given the above motivations, the problem that we shall consider is the following: 

Is it possible to suitably modify the MUSMAR algorithm so as to adaptively
get steady-state LQS regulation for any CARMA plant by using
a small number of predictors whose parameters are separately estimated
via standard RLS estimators?

Problem formulation The SISO CARMA plant (1) is considered. As usual,
the order of the plant is denoted by n,

$$n = \max\{\partial A(d), \partial B(d), \partial C(d)\}.$$

Associated with the plant, we consider a quadratic cost defined over an NT-steps
horizon

(9.6-35)

with $\rho > 0$, to be minimized in a receding horizon sense. In the sequel, it will
become clear why in (35) the prediction horizon is denoted by NT. In fact, it will
be convenient to increase the regulation horizon by holding T fixed and letting N
become larger.

Our goal is to find a convenient procedure by which to adaptively select the
input u(t) to the plant (1) minimizing the cost (35) as $N \to \infty$, subject to the
following requirements:

• The feedback is updated on the grounds of the predictor parameters estimated
by standard RLS algorithms;

• The number of operations per single cycle does not grow with N.

The following receding horizon scheme, for any $T \ge n + 1$, is considered to possibly
achieve the stated goals.
MUSMAR-∞ algorithm

i. Given all I/O data up to time t, compute RLS estimates of the parameter
matrices $\Psi$, $\Theta$ in the following set of prediction models:

$$z(t) = \Psi z(t-T) + \Theta u(t-T) + \zeta(t) \tag{9.6-36}$$

where $\zeta(t)$ denotes a residual vector uncorrelated with both $z(t-T)$ and
$u(t-T)$,

$$z(t) := [\, s'(t) \;\; \gamma'(t) \,]',$$

$$s(t) := [\, y(t) \cdots y(t-n+1) \;\; u(t-1) \cdots u(t-n) \,]'$$

$$\gamma(t) := [\, y(t-n) \cdots y(t-T+1) \;\; u(t-n-1) \cdots u(t-T) \,]'$$

$\Psi$ is a 2T × 2T matrix and $\Theta$ a 2T × 1 vector such that the bottom row of
$\Psi$ is zero, the bottom element of $\Theta$ is 1, and the last 2(T − n) columns of
$\Psi$ are zero.

ii. Update the matrix of weights P by the difference pseudo-Riccati equation
(Cf. (5.7-9))

$$\bar P = \Psi_F' P \Psi_F + \Omega \tag{9.6-37}$$

where

$$\Omega := \mathrm{diag}\{\underbrace{1 \cdots 1}_{n},\; \underbrace{\rho \cdots \rho}_{n},\; \underbrace{1 \cdots 1}_{T-n},\; \underbrace{\rho \cdots \rho}_{T-n}\}$$

$$\Psi_F := \Psi + \Theta F' \tag{9.6-38}$$

and P and F are, respectively, the matrix of weights and the feedback vector
used at time t − 1.

iii. Update the augmented vector of feedback gains by

$$\bar F = -(\Theta' \bar P \Theta)^{-1} \Psi' \bar P \Theta \tag{9.6-39}$$

with $\Theta$ and $\Psi$ replaced by their current estimates, and then apply the control
at time t given by

$$u(t) = \bar F_s' s(t) \tag{9.6-40}$$

where $\bar F_s$ is made up by the first 2n components of $\bar F$.

iv. Set $P = \bar P$, $F = \bar F$, sample the output of the plant and go to step i. with t
replaced by t + 1.
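
The control-synthesis part of the scheme (steps ii and iii) can be sketched numerically as follows. This is only an illustration under stated assumptions: the matrices $\Psi$ and $\Theta$ are taken as known rather than estimated, the helper names (`matmul`, `transpose`, `synthesis_step`, `iterate_toy`) are hypothetical, and the two-dimensional toy model $y(t+1) = a\,y(t) + b\,u(t)$ with state $z = [\,y \;\; u\,]'$ is chosen for hand-checkability, not as a full MUSMAR-∞ prediction model.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def synthesis_step(Psi, Theta, P, F, omega_diag):
    """One pass through steps ii and iii; returns (P_bar, F_bar)."""
    m = len(Theta)
    # (9.6-38): Psi_F = Psi + Theta F'
    Psi_F = [[Psi[i][j] + Theta[i] * F[j] for j in range(m)] for i in range(m)]
    # (9.6-37): P_bar = Psi_F' P Psi_F + Omega
    P_bar = matmul(matmul(transpose(Psi_F), P), Psi_F)
    for i in range(m):
        P_bar[i][i] += omega_diag[i]
    # (9.6-39): F_bar = -(Theta' P_bar Theta)^{-1} Psi' P_bar Theta
    P_theta = [sum(P_bar[i][j] * Theta[j] for j in range(m)) for i in range(m)]
    denom = sum(Theta[i] * P_theta[i] for i in range(m))
    F_bar = [-sum(Psi[i][j] * P_theta[i] for i in range(m)) / denom for j in range(m)]
    return P_bar, F_bar


def iterate_toy(a=2.0, b=1.0, rho=1.0, steps=40):
    """Iterate the synthesis rule on the toy model with fixed Psi and Theta."""
    Psi = [[a, 0.0], [0.0, 0.0]]      # bottom row and last column are zero
    Theta = [b, 1.0]                  # bottom element equals one
    omega = [1.0, rho]
    P = [[1.0, 0.0], [0.0, rho]]      # P initialized by Omega
    F = [0.0, 0.0]
    for _ in range(steps):
        P, F = synthesis_step(Psi, Theta, P, F, omega)
    return P, F


if __name__ == "__main__":
    P, F = iterate_toy()
    print("feedback gain:", F)        # first entry is close to -1.618 for a=2, b=1, rho=1
```

For this toy model the iterations settle on the gain given by the scalar steady-state Riccati equation, even though the open-loop model is unstable and the initial feedback is zero; the last entry of the gain vector stays at zero, as implied by the zero last column of $\Psi$.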

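Remark 9.6-2 The estimation of the parameters in (36) is performed by first
updating RLS estimates of the parameters $\theta_i$, $\Gamma_i$, $i = 1, \cdots, T$, and $\mu_i$, $\Lambda_i$,
$i = 2, \cdots, T$, in the following set of prediction models (Cf. (3-11)):

$$y(t-T+i) = \theta_i u(t-T) + \Gamma_i' s(t-T) + \epsilon_i(t-T+i) \tag{9.6-41a}$$

$$u(t-T+i-1) = \mu_i u(t-T) + \Lambda_i' s(t-T) + \nu_i(t-T+i-1) \tag{9.6-41b}$$

This is accomplished with the formulae

$$\begin{bmatrix} \theta_i(t) \\ \Gamma_i(t) \end{bmatrix} = \begin{bmatrix} \theta_i(t-1) \\ \Gamma_i(t-1) \end{bmatrix} + K(t-T)\left[ y(t-T+i) - \theta_i(t-1)u(t-T) - \Gamma_i'(t-1)s(t-T) \right] \tag{9.6-42a}$$

$$\begin{bmatrix} \mu_i(t) \\ \Lambda_i(t) \end{bmatrix} = \begin{bmatrix} \mu_i(t-1) \\ \Lambda_i(t-1) \end{bmatrix} + K(t-T)\left[ u(t-T+i-1) - \mu_i(t-1)u(t-T) - \Lambda_i'(t-1)s(t-T) \right] \tag{9.6-42b}$$

$$\varphi(t-T) = [\, u(t-T) \;\; s'(t-T) \,]'$$

$$K(t-T) = P(t-T+1)\varphi(t-T) \tag{9.6-42c}$$

$$P^{-1}(t-T+1) = P^{-1}(t-T) + \varphi(t-T)\varphi'(t-T) \tag{9.6-42d}$$

In the above, $\theta_i$ and $\mu_i$ are scalars, $\Gamma_i$ and $\Lambda_i$ are column vectors of dimension 2n,
and $\epsilon_i(t+i)$ and $\nu_i(t+i)$ are uncorrelated with both u(t) and s(t). Note that since the
regressor $\varphi(t-T)$ is the same for all the models in the RLS algorithms, there is only
the need to update one covariance matrix of dimension 2n + 1. As pointed out for
MUSMAR, this considerably reduces the numerical complexity of the algorithm.
The estimates of the matrices $\Psi$ and $\Theta$ are of the form:

$$\Psi = \begin{bmatrix} \Gamma_T' & 0 \\ \vdots & \vdots \\ \Gamma_{T-n+1}' & 0 \\ \Lambda_T' & 0 \\ \vdots & \vdots \\ \Lambda_{T-n+1}' & 0 \\ \Gamma_{T-n}' & 0 \\ \vdots & \vdots \\ \Gamma_1' & 0 \\ \Lambda_{T-n}' & 0 \\ \vdots & \vdots \\ \Lambda_2' & 0 \\ 0 & 0 \end{bmatrix} \tag{9.6-43a}$$

where the matrix has 2T rows, the first 2n rows correspond to s and the last 2(T − n) rows to $\gamma$, and the zero block on the right has 2(T − n) columns;

$$\Theta = [\, \theta_T \cdots \theta_{T-n+1} \;\; \mu_T \cdots \mu_{T-n+1} \;\; \theta_{T-n} \cdots \theta_1 \;\; \mu_{T-n} \cdots \mu_2 \;\; 1 \,]' \tag{9.6-43b}$$

□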
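
The saving pointed out in the remark, one gain and one covariance update shared by all the predictors, can be sketched as follows. The code is a minimal illustration, not the MUSMAR-∞ identifier itself: the function names, the synthetic noise-free data and the initial covariance are assumptions made for the example. The covariance update is written in the equivalent matrix-inversion-lemma form of (42c)-(42d).

```python
import random


def rls_shared_update(P, phi, thetas, targets):
    """One RLS step for several linear models sharing the regressor phi.

    P       : covariance matrix (list of rows), updated in place
    phi     : common regressor
    thetas  : list of parameter vectors, one per model, updated in place
    targets : one scalar observation per model
    """
    m = len(phi)
    P_phi = [sum(P[i][j] * phi[j] for j in range(m)) for i in range(m)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(m))
    # Gain K = P_new * phi = P_old * phi / (1 + phi' P_old phi), common to all models
    K = [v / denom for v in P_phi]
    for i in range(m):
        for j in range(m):
            P[i][j] -= K[i] * P_phi[j]
    for theta, target in zip(thetas, targets):
        err = target - sum(t * p for t, p in zip(theta, phi))
        for i in range(m):
            theta[i] += K[i] * err


def demo(seed=0, steps=60):
    """Identify two models from synthetic noise-free data (illustrative only)."""
    rng = random.Random(seed)
    true = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7]]
    m = 3
    P = [[1e6 if i == j else 0.0 for j in range(m)] for i in range(m)]
    est = [[0.0] * m for _ in true]
    for _ in range(steps):
        phi = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        targets = [sum(t * p for t, p in zip(th, phi)) for th in true]
        rls_shared_update(P, phi, est, targets)
    return true, est


if __name__ == "__main__":
    true, est = demo()
    print(est)
```

Only the innovation and the parameter correction are computed per model; the quadratic-cost part of the update (gain and covariance) is computed once per time step regardless of the number of predictors.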



Remark 9.6-3 The vector F has dimension 2T. Given the structure of $\Psi$, with
zeros on the last 2(T − n) columns, the last 2(T − n) entries of F are also zero. □

Justification of MUSMAR-∞ We show that, under suitable assumptions,
the steady-state LQS regulation feedback is an equilibrium point of MUSMAR-∞.
Some required results drawn from Theorem 3-1 are summed up in the following
lemma.

Lemma 9.6-2. Let the inputs to the CARMA plant (1) be given by

$$u(k) = F's(k) \tag{9.6-44}$$

or equivalently, for suitable polynomials R(d) and S(d), by

$$R(d)u(k) = -S(d)y(k) \tag{9.6-45}$$

Let R and S be coprime and such that the closed-loop d-characteristic polynomial
$\chi(d) = A(d)R(d) + B(d)S(d)$ be strictly Hurwitz and divided by C(d):

$$C(d) \mid \chi(d) \tag{9.6-46}$$
Then, if the above conditions are fulfilled for

$$k = t-n, \cdots, t-1, t+1, \cdots, t+T-1, \tag{9.6-47}$$

z(t + T) has the following representation

$$z(t+T) = \Psi z(t) + \Theta u(t) + \zeta(t+T) \tag{9.6-48}$$

where

$$\zeta(t+T) \in \mathrm{Span}\{e_{t+1}^{t+T}\} \tag{9.6-49}$$

Remark 9.6-4 The parameters in (48) characterize the dynamics of the closed-loop
system. Therefore, they depend on the feedback gain polynomials R(d) and
S(d), as well as on the plant and disturbance dynamics, defined by polynomials
A(d), B(d) and C(d). □

The interest in (48) is that (35), which may be written as

$$\frac{1}{N}\,\mathcal{E}\Big\{\sum_{i=1}^{N} \|z(t+iT)\|^2_\Omega\Big\} \tag{9.6-50}$$

can be easily minimized w.r.t. u(t) provided that $\Psi$ and $\Theta$ are known and suitable
assumptions, to be discussed next, are made on the magnitude of T and past and
future inputs. In fact, if (44)-(47) are assumed, (48) expresses z(t + T) in terms of
u(t). Next,

$$z(t+iT) = \Psi z(t+(i-1)T) + \Theta u(t+(i-1)T) + \zeta(t+iT) \tag{9.6-51}$$

also for all $i \ge 2$ if the inputs u(k) are given by the previous feedback law for

$$k = t-n+(i-1)T, \cdots, t+iT-1 \tag{9.6-52}$$

Since (50) has to be minimized w.r.t. u(t), u(t) must be left unconstrained. Taking
i = 2, it is seen that all the inputs between t + (T − n) and t + 2T − 1 (the shaded
interval in Fig. 3) must be given by a constant feedback. Since u(t) must be left
unconstrained, this implies (Cf. Fig. 3)

$$T \ge n + 1 \tag{9.6-53}$$

Clearly, according to the definition of n, (53) already comprises the plant I/O delay.
The above considerations are summed up in the following lemma.

Lemma 9.6-3. Let assumptions (44)-(46) be fulfilled for

$$k = t-n, \cdots, t-1, t+1, \cdots, t+NT-1$$

Then, if T satisfies (53), z(t + iT), $i = 1, 2, \cdots, N$, has the state-space representation
(51) irrespective of the plant C(d) innovations polynomial and the value taken on
by u(t).

Remark 9.6-5 Inequality (53) specifies, in terms of the plant order n, the minimal
dimension of the state z required to carry out correctly the minimization
procedure under consideration. □
Figure 9.6-3: Illustration of the minorant imposed on T. [Time axis marked at t, t+1, t+T, t+T+1, t+2T−1, t+2T, t+2T+1 and t+NT: the inputs over the shaded interval from t+(T−n) to t+2T−1 are constrained to be given by a constant feedback, whereas u(t) must be left unconstrained.]

Thus, assuming (53), (51) can be used for all $i \ge 1$. For $i \ge 2$, using (44) in (51),
one has

$$z(t+iT) = \Psi_F z(t+(i-1)T) + \zeta(t+iT) = \tilde z(t+iT) + \hat z(t+iT) \tag{9.6-54}$$

where $\Psi_F$ is as in (38), $\tilde z(t+iT)$ is the zero-input response from the initial state
z(t + 2T), and $\hat z(t+iT)$ is the response due to $\zeta(t+iT)$ from the zero state at time
t + 2T. Thus, taking into account that

$$\mathcal{E}\{\tilde z(t+iT)\,\hat z'(t+iT)\} = 0 \tag{9.6-55}$$

and denoting (50) by $C_N(t,F)$, so as to point out that past and future inputs are
given by a constant feedback, one has

$$C_N(t,F) = \frac{1}{N}\,\mathcal{E}\{\|z(t+T)\|^2_\Omega + \|z(t+2T)\|^2_{P(N)} + V_N(t,F)\} \tag{9.6-56}$$

where

$$V_N(t,F) = \sum_{i=3}^{N}\|\hat z(t+iT)\|^2_\Omega$$

is not affected by u(t), and

$$P(N) := \Omega + \sum_{i=1}^{N-2}(\Psi_F')^i\,\Omega\,\Psi_F^i$$

satisfies the following Lyapunov-type equation

$$P(N) = \Psi_F' P(N) \Psi_F + \Omega - \Lambda(N) \tag{9.6-57}$$

with $\Lambda(N) := (\Psi_F')^{N-1}\,\Omega\,\Psi_F^{N-1}$. Thus, the first two additive terms in (56) equal

$$\mathcal{E}\{\|z(t+T)\|^2_\Omega + \|\Psi_F z(t+T)\|^2_{P(N)}\} = \mathcal{E}\{\|z(t+T)\|^2_{P(N)+\Lambda(N)}\}$$
Consequently $\arg\min_{u(t)} C_N(t,F) = \bar F'(N)z(t)$ with

$$\bar F'(N) = -\{\Theta'[P(N)+\Lambda(N)]\Theta\}^{-1}\,\Theta'[P(N)+\Lambda(N)]\Psi \tag{9.6-58}$$

We now consider the minimization of $C_N(t,F)$ w.r.t. u(t) as $N \to \infty$. $\Psi_F$ being a
stability matrix, we can define

$$C_\infty(t,F) := \lim_{N\to\infty} C_N(t,F) \tag{9.6-59}$$

Since

$$P(N) + \Lambda(N) \ge P(N) \ge \Omega \ge \epsilon I,$$

with $0 < \epsilon < \min(1,\rho)$, $\bar F(N)$ is a continuous function of $P(N) + \Lambda(N)$. Consequently,
since $P(N) + \Lambda(N) \to P$ as $N \to \infty$, one has

$$\bar F' := \lim_{N\to\infty}\bar F'(N) = -[\Theta' P \Theta]^{-1}\,\Theta' P \Psi \tag{9.6-60}$$

with P solution of the following Lyapunov equation

$$P = \Psi_F' P \Psi_F + \Omega. \tag{9.6-61}$$
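
The finite-horizon weights used above can be checked numerically. The sketch below builds the sum of the terms $(\Psi_F')^i\,\Omega\,\Psi_F^i$ for $i = 0, \cdots, N-2$ together with the following term of the same sequence, $(\Psi_F')^{N-1}\,\Omega\,\Psi_F^{N-1}$, and evaluates the residual of the Lyapunov-type equation (57). The 2 × 2 stability matrix and the helper names are arbitrary assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def finite_horizon_weights(Psi_F, Omega, N):
    """Return (P_N, Lambda_N) for a horizon of N blocks (N >= 2)."""
    n = len(Omega)
    term = [row[:] for row in Omega]            # (Psi_F')^i Omega Psi_F^i, i = 0
    P = [[0.0] * n for _ in range(n)]
    for _ in range(N - 1):                      # accumulate i = 0, ..., N-2
        P = [[p + t for p, t in zip(pr, tr)] for pr, tr in zip(P, term)]
        term = matmul(matmul(transpose(Psi_F), term), Psi_F)
    return P, term                              # term now holds the i = N-1 term


def lyapunov_residual(Psi_F, Omega, N):
    """Largest entry of Psi_F' P Psi_F + Omega - Lambda - P, in absolute value."""
    P, Lam = finite_horizon_weights(Psi_F, Omega, N)
    rhs = matmul(matmul(transpose(Psi_F), P), Psi_F)
    n = len(Omega)
    return max(
        abs(rhs[i][j] + Omega[i][j] - Lam[i][j] - P[i][j])
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    Psi_F = [[0.5, 0.1], [0.0, 0.3]]            # assumed stability matrix
    Omega = [[1.0, 0.0], [0.0, 0.1]]
    print(lyapunov_residual(Psi_F, Omega, 6))
```

For a stability matrix the correction term dies out geometrically with N, so that the finite-horizon weight approaches the solution of the Lyapunov equation (61).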

Theorem 9.6-2. Under the same assumptions as in Lemma 3, the input at time t
minimizing $C_\infty(t,F)$ in (59) is given by $u(t) = \bar F'z(t)$, with $\bar F$ specified by (60) and
(61). Further, if the procedure used to generate $\bar F$ from F is iterated, the steady-state
LQS regulation feedback is an equilibrium point for the resulting iterations.

Proof The validity of the last assertion basically follows from the properties of Kleinman iterations
(Cf. Sect. 2.5).

By the structure of matrix $\Psi$, the last 2(T − n) elements of $\bar F$ are zero and thus
(40) holds. Further, in order to circumvent difficulties associated with possible
feedback vectors which temporarily make the closed-loop system unstable, and hence
the Lyapunov equation (61) meaningless, in MUSMAR-∞ P is updated via the
pseudo-Riccati difference equation (37). This change makes the control law
underlying MUSMAR-∞ a stochastic variant of spread-in-time MKI as discussed in
Sect. 5.7.

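Algorithmic considerations The matrix $\Psi$ and the vector $\Theta$ can be partitioned
in the following blocks:

$$\Psi = \begin{bmatrix} \Psi_s & 0 \\ \Psi_\gamma & 0 \end{bmatrix} \qquad \Theta = \begin{bmatrix} \Theta_s \\ \Theta_\gamma \end{bmatrix}$$

where the upper blocks have 2n rows and the lower blocks 2(T − n) rows,
with the bottom row of $\Psi_\gamma$ and the bottom element of $\Theta_\gamma$ zero and one, respectively, i.e. the
prediction model (36) does not impose any constraint on u(t).
Let P be partitioned accordingly:

$$P = \begin{bmatrix} P_s & P_{s\gamma} \\ P_{s\gamma}' & P_\gamma \end{bmatrix} \tag{9.6-62}$$

and the same for $\bar P$. Then, a simple calculation shows that (37) and (39) can be
simplified as follows

$$\bar P_s = (\Psi_s + \Theta_s F_s')' P_s (\Psi_s + \Theta_s F_s') + \Omega_s \tag{9.6-63}$$

and

$$\bar F_s' = -\frac{\Theta_s' \bar P_s \Psi_s + \Theta_\gamma' \Omega_\gamma \Psi_\gamma}{\Theta_s' \bar P_s \Theta_s + \Theta_\gamma' \Omega_\gamma \Theta_\gamma} \tag{9.6-64}$$

with

$$\Omega_s := \mathrm{diag}\{\underbrace{1 \cdots 1}_{n},\; \underbrace{\rho \cdots \rho}_{n}\}$$

$$\Omega_\gamma := \mathrm{diag}\{\underbrace{1 \cdots 1}_{T-n},\; \underbrace{\rho \cdots \rho}_{T-n}\}$$

and $P_s$ initialized by $P_s = \Omega_s$.

The algorithm assumes $\rho > 0$. This in practice constitutes no restriction since
$\rho$ may be made as small as needed.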

Simulation results Some examples are considered in order to exhibit the features
of the MUSMAR-∞ algorithm. Comparisons are made with an indirect
steady-state LQS adaptive controller (ILQS) based on the same underlying control
problem as MUSMAR-∞, the difference being that ILQS first identifies
the usual one-step-ahead prediction model and then computes the steady-state LQS
regulation law indirectly. Both full and reduced complexity controllers
are considered, the aim being to show that MUSMAR-∞ is capable of stabilizing
plants requiring long prediction horizons, a feature due to its underlying regulation
law, while still retaining good performance robustness thanks to its parallel identification
scheme.

Example 9.6-3 A second-order nonminimum-phase plant is adaptively regulated by MUSMAR
and MUSMAR-∞. The plant to be regulated is

$$y(t+1) - 1.5y(t) + 0.7y(t-1) = u(t) - 1.01u(t-1) + e(t+1)$$

with input weight $\rho = 0.1$ in the performance index (35). Here, and in the following examples, e
is a sequence of independent Gaussian random variables with zero mean and unit variance.
Fig. 4 compares the accumulated loss divided by time, viz.

$$\frac{1}{t}\sum_{i=1}^{t}\left[y^2(i) + \rho u^2(i-1)\right]$$

for MUSMAR (T = 3) and MUSMAR-∞ (T = 3). In both cases a full complexity controller is
used. Since this plant has a nonminimum-phase zero very close to the stability boundary, the
prediction horizon T must be very large in order for MUSMAR to behave close to the optimal
performance. For T = 3 (a value chosen according to the rule T = n + 1), MUSMAR-∞ yields a
much smaller cost than MUSMAR, exhibiting a loss very close to the optimal one.
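
The performance measure used in the figures, the accumulated loss divided by time, is straightforward to compute from an I/O record. The sketch below does so and, to have some data at hand, simulates the plant of this example in open loop with zero input. The simulation is only an assumed stand-in for a data source: it does not reproduce the adaptive experiments of Fig. 4, and the function names and the seed are illustrative.

```python
import random


def accumulated_loss(y, u, rho):
    """Return [J(1), ..., J(t_max)], J(t) = (1/t) * sum_{i=1}^{t} [y(i)^2 + rho*u(i-1)^2].

    y[i] holds y(i) for i = 0, ..., t_max and u[i] holds u(i) for i = 0, ..., t_max - 1.
    """
    total, out = 0.0, []
    for t in range(1, len(y)):
        total += y[t] ** 2 + rho * u[t - 1] ** 2
        out.append(total / t)
    return out


def simulate_open_loop(steps, seed=0):
    """Plant of the example with u = 0 and zero initial conditions."""
    rng = random.Random(seed)
    y, u = [0.0], [0.0] * steps
    for t in range(steps):
        y_prev = y[t - 1] if t > 0 else 0.0
        u_prev = u[t - 1] if t > 0 else 0.0
        e = rng.gauss(0.0, 1.0)                     # zero mean, unit variance
        y.append(1.5 * y[t] - 0.7 * y_prev + u[t] - 1.01 * u_prev + e)
    return y, u


if __name__ == "__main__":
    y, u = simulate_open_loop(2000)
    loss = accumulated_loss(y, u, rho=0.1)
    print("open-loop accumulated loss divided by time:", loss[-1])
```

Replacing the zero input by the input generated by a regulator gives the curves to be compared; the open-loop value merely provides a reference level for the unregulated plant.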
Figure 9.6-4: The accumulated loss divided by time when the plant of Example
3 is regulated by MUSMAR (T = 3) and MUSMAR-∞ (T = 3).

Example 9.6-4 Since MUSMAR-∞ is based on a state-space model built upon separately
estimated predictors, it turns out, according to Sect. 3, not to be affected by a C(d) polynomial
different from 1 in the CARMA plant representation. In this example the following plant with a
highly correlated equation error is considered:

$$y(t+1) = u(t) + e(t+1) - 0.99e(t) \tag{9.6-65}$$

For $\rho = 1$, MUSMAR-∞ converges to $F_s = [\, 0.492 \;\; 0.494 \,]'$, the optimal feedback being
$F^* = [\, 0.495 \;\; 0.495 \,]$.

Example 9.6-5 MUSMAR was shown to be robustly self-optimizing, in the sense that, if T is
large enough, and MUSMAR converges, it converges to the minima of the cost constrained to the
chosen regulator regressor. MUSMAR-∞ is expected to inherit this robustness property due to
the fact that it is based on a set of implicit prediction models whose parameters are separately
estimated. Consider the open-loop unstable plant

$$y(t+1) + 0.9y(t) - 0.5y(t-1) = u(t) + e(t+1) - 0.7e(t)$$

Although the plant is of second order, and hence its pseudostate is

$$[\, y(t) \;\; y(t-1) \;\; u(t-1) \;\; u(t-2) \,]',$$

s(t) is instead chosen to be

$$s(t) = [\, y(t) \;\; u(t-1) \,]' \tag{9.6-66}$$

The optimal feedback constrained to the above choice of s(t) is, for $\rho = 1$,

$$F^* = [\, 1.147 \;\; -0.109 \,]. \tag{9.6-67}$$

Fig. 5 shows the time evolution of the feedback when the above plant is regulated by MUSMAR-∞
on the space $[\, f_1 \;\; f_2 \,]$, superimposed on the level curves of the quadratic cost, constrained
to the chosen regulator regressor. As is apparent, MUSMAR-∞ is able to tune close to the
minimum of the underlying cost, despite the presence of unmodelled plant dynamics.

Example 9.6-6 This example aims at showing the importance of the separate estimation of the
predictor parameters in MUSMAR-∞. A comparison is made with ILQS.
Consider the fourth-order, nonminimum-phase, open-loop stable plant

$$y(t+4) - 0.167y(t+3) - 0.74y(t+2) - 0.132y(t+1) + 0.87y(t) = 0.132u(t+3) + 0.545u(t+2) + 1.117u(t+1) + 0.262u(t) + e(t+4)$$

Fig. 6 and Fig. 7 show the results obtained when a reduced-complexity regulator

$$u(t) = f_1 y(t) + f_2 y(t-1) + f_3 u(t-1) + f_4 u(t-2)$$

with $\rho = 10^{-4}$ is used for this plant. Fig. 6 shows the time evolution of the first three components of the feedback when
MUSMAR-∞ is used. Fig. 7 shows the accumulated loss divided by time, when ILQS, MUSMAR
(T = 3) and MUSMAR-∞ (T = 3) are used. Although both MUSMAR-∞ and ILQS yield the
steady-state LQS feedback under model matching conditions, due to the presence of unmodelled
dynamics, ILQS presents a big detuning. MUSMAR-∞, instead, being based on a multipredictor
model, tends to be insensitive to plant unmodelled dynamics.
Figure 9.6-5: The evolution of the feedback calculated by MUSMAR-∞ in Example
5, superimposed on the level curves of the underlying quadratic cost.

Figure 9.6-6: Convergence of the feedback when the plant of Example 6 is controlled
by MUSMAR-∞.

Figure 9.6-7: The accumulated loss divided by time when the plant of Example
6 is controlled by ILQS, standard MUSMAR (T = 3) and MUSMAR-∞ (T = 3).

Main points of the section Implicit modelling theory can be further exploited
so as to construct extensions of the MUSMAR algorithm. The first extension,
CMUSMAR, embodies a mean-square input value constraint. ODE analysis and
singular perturbation methods show that, when suitable provisions are taken, the
local constrained minima of the underlying quadratic performance index are possible
convergence points of CMUSMAR, also in the presence of unmodelled dynamics.
In the second extension, MUSMAR-∞, implicit modelling theory is blended with
spread-in-time MKI so as to realize an implicit adaptive regulator for which the
steady-state LQS regulation feedback is an equilibrium point.

Notes and References 

Adaptive LQ controllers have been considered in [SF81], [Sam82], [Gri84], [Pet84],
[OK87]. The basic pole assignment approach to self-tuning has been discussed in
[WEPZ79], [WPZ79] and [PW81]. In contrast with CC and MV control, both LQ
and pole assignment control require the fulfillment of a stabilizability condition
[DL84], [LG85] for the identified model. This can be ensured by using a projection
facility to constrain the estimated parameters in a convex set containing the
unknown parameters and such that every element of the set satisfies the
stabilizability condition required for computing the control law. While the existence
of such a set can be postulated for theoretical developments, in practice the
definition of such convex sets for higher order plants is complicated and sometimes
unfeasible. An alternative approach to deal with the stabilizability condition is to
suitably enforce persistency of excitation as in [ECD85], [Cri87] and [Kre89]. We
mainly borrow similar ideas for constructing, by using a self-excitation mechanism,
the globally convergent adaptive SIORHC for the ideal case described in Sect. 1,
[MZ93], [MZB93]. Along similar lines, we construct the robust adaptive SIORHC
for the bounded disturbance case by using a constant trace normalized RLS with
a dead-zone and a suitable self-excitation mechanism.

In the neglected dynamics case, combinations of data normalization, relative
dead zones and projection of estimates onto convex sets have been proposed by
many authors. E.g., see [MGHM88] and [CMS91]. In [WH92] it is shown that in
the presence of neglected dynamics, projection of the estimates suffices to get global
boundedness for adaptive pole assignment. An attempt to use the persistency of
excitation approach in the neglected dynamics case is described in [GMDD91].
The use of both low-pass prefiltering of I/O data for identification and high-pass
dynamic weights for control design is in practice of vital importance in the presence
of neglected dynamics. For a description of the first of these concepts expressly
tailored for GPC see [RC89], [SMS91] and [SMS92].

The presentation in Sect. 2 of implicit multistep prediction models of linear-regression
type is a simplified version of [MZ85] and [MZ89c] where the main results
on this topic were first presented. See also [CDMZ87] and [CMP91]. For the
difficulties with the self-tuning approach for general cost criteria see also [LKS85].
Though we use the notion of implicit linear-regression prediction models so as to
provide a motivated derivation of MUSMAR, the introduction of the latter [MM80],
[MZ83], [Mos83], [GMMZ84] preceded the discovery of the above implicit modelling
property. Ever since its introduction, a great deal of simulative and application
experience in case studies [GIM+90] has revealed the self-optimizing
properties of MUSMAR as a reduced-complexity adaptive controller. The local analysis of the
self-optimizing properties of MUSMAR in Sect. 5 appeared in [MZL89]. See also [Mos83],
[MZ84] and [MZ87]. This study is based on the ODE method for analysing stochastic
recursive algorithms [Lju77a], [Lju77b]. See also [ABJ+86] for nonstochastic
averaging methods.

The MUSMAR algorithm with mean-square input constraint was reported in
[MLMN92] and its analysis is based on some results of singular perturbation theory
of ODEs [Was65], [KKO86]. MUSMAR-∞, the implicit adaptive LQG algorithm
based on spread-in-time modified Kleinman iterations, was introduced in [ML89].






Appendices 






APPENDIX A 

SOME RESULTS FROM LINEAR 
SYSTEMS THEORY 

In this appendix we briefly review some results from linear systems theory used
in this book. For more extensive treatments standard textbooks, for example
[ZD63], [Che70], [Bro70], [Des70a], [PA74], [Kai80] and [CD91], should be consulted.

A.1 State-Space Representations

A discrete-time dynamic linear system is represented in state-space form as follows

$$\left.\begin{aligned} x(k+1) &= \Phi(k)x(k) + G(k)u(k) \\ y(k) &= H(k)x(k) + J(k)u(k) \end{aligned}\right\} \tag{A.1-1}$$

Here: $k \in \mathbb{Z}$; $x(k) \in \mathbb{R}^n$ is the system state at time k; $u(k) \in \mathbb{R}^m$ and $y(k) \in \mathbb{R}^p$ the
system input and, respectively, output at time k; n is called the system dimension;
and $\Phi(k)$, G(k), H(k), J(k) are matrices with real entries of compatible dimensions.

The basic idea of state is that it contains all the information on the past history
of the system relevant to describe its future behaviour in terms of the present and
future inputs only. In fact from (1) we can compute the state x(j) given the state
x(k), k < j, in terms of the inputs $u_{[k,j)}$ only:

$$x(j) = \varphi(j,k,x(k),u_{[k,j)}) := \Phi(j,k)x(k) + \sum_{i=k}^{j-1}\Phi(j,i+1)G(i)u(i) \tag{A.1-2}$$

where

$$\Phi(j,k) := \begin{cases} I_n, & j = k \\ \Phi(j-1)\cdots\Phi(k), & j > k \end{cases} \tag{A.1-3}$$

is the system state-transition matrix, and $\varphi(j,k,x(k),u_{[k,j)})$ the global
state-transition function. Note that, by linearity of the system, the latter is the
superposition of the zero-input response $\varphi(j,k,x(k),O_\Omega)$ with the zero-state response
$\varphi(j,k,O_X,u_{[k,j)})$:

$$\varphi(j,k,x(k),u_{[k,j)}) = \varphi(j,k,x(k),O_\Omega) + \varphi(j,k,O_X,u_{[k,j)}) \tag{A.1-4}$$

$$\varphi(j,k,x(k),O_\Omega) := \Phi(j,k)x(k) \tag{A.1-5}$$

$$\varphi(j,k,O_X,u_{[k,j)}) := \sum_{i=k}^{j-1}\Phi(j,i+1)G(i)u(i) \tag{A.1-6}$$

In the above equations $O_\Omega$ denotes the zero input sequence and $O_X$ the zero state.

Similar superposition properties hold for the system output response. In particular,
if the system is time-invariant, i.e.

$$\Phi(k) = \Phi, \quad G(k) = G, \quad H(k) = H, \quad J(k) = J, \quad \forall k \in \mathbb{Z} \tag{A.1-7}$$

we have for $i \in \mathbb{Z}_+$

$$y(k+i) = S_i x(k) + \sum_{j=0}^{i} w(j)u(k+i-j) \tag{A.1-8}$$

where

$$S_i := H\Phi^i \tag{A.1-9}$$

and

$$w(j) := \begin{cases} J, & j = 0 \\ H\Phi^{j-1}G, & j \ge 1 \end{cases} \tag{A.1-10}$$

is the j-th sample of the system impulse response $W := \{w(j)\}_{j=0}^{\infty}$.
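
As a check on (8)-(10), the output of a time-invariant SISO system can be computed both by running the state recursion (1) and by superposing the free response $S_i x(0)$ with the convolution of the input with the impulse response. The following sketch compares the two; the numerical values and function names are arbitrary assumptions used only for illustration.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def simulate(Phi, G, H, J, x0, u):
    """Outputs y(0), ..., y(len(u)-1) from the state recursion."""
    x, ys = list(x0), []
    for uk in u:
        ys.append(dot(H, x) + J * uk)
        x = [xi + g * uk for xi, g in zip(matvec(Phi, x), G)]
    return ys


def impulse_response(Phi, G, H, J, length):
    """w(0) = J and w(j) = H Phi^(j-1) G for j >= 1."""
    w, v = [J], list(G)
    for _ in range(1, length):
        w.append(dot(H, v))
        v = matvec(Phi, v)
    return w


def output_via_impulse_response(Phi, G, H, J, x0, u):
    """y(i) = H Phi^i x(0) + sum_{j=0}^{i} w(j) u(i-j)."""
    w = impulse_response(Phi, G, H, J, len(u))
    ys, v = [], list(x0)
    for i in range(len(u)):
        forced = sum(w[j] * u[i - j] for j in range(i + 1))
        ys.append(dot(H, v) + forced)
        v = matvec(Phi, v)
    return ys


if __name__ == "__main__":
    Phi = [[0.0, 1.0], [-0.7, 1.5]]
    G, H, J = [0.0, 1.0], [0.5, 1.0], 0.2
    x0, u = [1.0, -1.0], [1.0, 0.0, -2.0, 3.0, 0.5]
    print(simulate(Phi, G, H, J, x0, u))
    print(output_via_impulse_response(Phi, G, H, J, x0, u))
```

The two printed sequences agree up to rounding, which is the superposition property stated above specialized to the output.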

For the time-invariant system $\Sigma = (\Phi, G, H, J)$ the following definitions and
results apply.

• A state x is reachable from a state $\tilde x$ if there exists an input sequence $u_{[0,N)}$
of finite length N which can drive the system from the initial state $\tilde x$ to the
final state x:

$$x = \varphi(N, 0, \tilde x, u_{[0,N)})$$

$\Sigma$ is said to be completely reachable, or $(\Phi, G)$ a reachable pair, if every state
is reachable from any other state. This happens to be true if and only if every
state is reachable from the zero state $O_X$.

Theorem A.1-1. $\Sigma = (\Phi, G, H, J)$ is completely reachable if and only if

$$\mathrm{rank}\, R = n = \dim \Sigma \tag{A.1-11}$$

where

$$R := [\, G \;\; \Phi G \;\; \cdots \;\; \Phi^{n-1}G \,] \tag{A.1-12}$$

is the reachability matrix of $\Sigma$.
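
The rank test of Theorem A.1-1 lends itself to a direct numerical check. The sketch below forms the reachability matrix and computes its rank by Gaussian elimination in exact rational arithmetic, so that no tolerance has to be chosen; the function names and the two small examples are illustrative assumptions.

```python
from fractions import Fraction


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def reachability_matrix(Phi, G):
    """R = [G, Phi G, ..., Phi^(n-1) G], with G given as an n x m list of rows."""
    n = len(Phi)
    block, blocks = G, []
    for _ in range(n):
        blocks.append(block)
        block = matmul(Phi, block)
    # Concatenate the blocks side by side, row by row
    return [sum((b[i] for b in blocks), []) for i in range(n)]


def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    rows, cols, r = len(A), len(A[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return r


def is_completely_reachable(Phi, G):
    return rank(reachability_matrix(Phi, G)) == len(Phi)


if __name__ == "__main__":
    print(is_completely_reachable([[0, 1], [0, 0]], [[0], [1]]))   # chain of two delays
    print(is_completely_reachable([[1, 0], [0, 2]], [[1], [0]]))   # second mode not driven
```

By duality, the same routine applied to the transposed pair $(\Phi', H')$ tests complete observability.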



Theorem A.1-2. (GK canonical reachability decomposition)

Consider the system $\Sigma$ with reachability matrix R such that

$$\mathrm{rank}\, R = n_r < n = \dim \Sigma$$

Then, there exist nonsingular matrices M which transform $\Sigma$ into $\hat\Sigma$ of the
form

$$Mx(t) =: \hat x(t) = [\, x_r'(t) \;\; x_{\bar r}'(t) \,]', \quad \dim x_r(t) = n_r$$

$$\begin{bmatrix} x_r(t+1) \\ x_{\bar r}(t+1) \end{bmatrix} = \begin{bmatrix} \Phi_r & \Phi_{r\bar r} \\ 0 & \Phi_{\bar r} \end{bmatrix} \begin{bmatrix} x_r(t) \\ x_{\bar r}(t) \end{bmatrix} + \begin{bmatrix} G_r \\ 0 \end{bmatrix} u(t) \tag{A.1-13a}$$

$$y(t) = [\, H_r \;\; H_{\bar r} \,] \begin{bmatrix} x_r(t) \\ x_{\bar r}(t) \end{bmatrix} + Ju(t) \tag{A.1-13b}$$

with $\Sigma_r = (\Phi_r, G_r, H_r, J)$ completely reachable. $\hat\Sigma$ is said to be obtained from
$\Sigma$ via a Gilbert-Kalman (GK) canonical reachability decomposition.

• A state x is said to be controllable if there exists an input sequence $u_{[0,N)}$ of
finite length which drives the system state to $O_X$:

$$\varphi(N, 0, x, u_{[0,N)}) = O_X$$

$\Sigma$ is said to be completely controllable, or $(\Phi, G)$ a controllable pair, if every
state is controllable.

Theorem A.1-3. The system $\Sigma$ is completely controllable if and only if either
$\Sigma$ is completely reachable or the matrix $\Phi_{\bar r}$ resulting from a GK canonical
reachability decomposition of $\Sigma$ is nilpotent.

• A state x is said to be unobservable if the system output response, from the
state x and for the zero input, is zero at all times:

$$y(k) = H\Phi^k x = 0 \tag{A.1-14}$$

$\Sigma$ is said to be completely observable, or $(\Phi, H)$ an observable pair, if the only
unobservable state is the zero state $O_X$.

Theorem A.1-4. $\Sigma = (\Phi, G, H, J)$ is completely observable if and only if

$$\mathrm{rank}\, \Theta = n = \dim \Sigma \tag{A.1-15}$$

where

$$\Theta := \begin{bmatrix} H \\ H\Phi \\ \vdots \\ H\Phi^{n-1} \end{bmatrix} \tag{A.1-16}$$

is the observability matrix of $\Sigma$.

Theorem A.1-5. (GK canonical observability decomposition)

Consider the system $\Sigma$ with observability matrix $\Theta$ such that

$$\mathrm{rank}\, \Theta = n_o < n = \dim \Sigma$$

Then, there exist nonsingular matrices M which transform $\Sigma$ into $\hat\Sigma$ of the
form

$$Mx(t) =: \hat x(t) = [\, x_o'(t) \;\; x_{\bar o}'(t) \,]', \quad \dim x_o(t) = n_o$$

$$\begin{bmatrix} x_o(t+1) \\ x_{\bar o}(t+1) \end{bmatrix} = \begin{bmatrix} \Phi_o & 0 \\ \Phi_{\bar o o} & \Phi_{\bar o} \end{bmatrix} \begin{bmatrix} x_o(t) \\ x_{\bar o}(t) \end{bmatrix} + \begin{bmatrix} G_o \\ G_{\bar o} \end{bmatrix} u(t) \tag{A.1-17a}$$

$$y(t) = [\, H_o \;\; 0 \,] \begin{bmatrix} x_o(t) \\ x_{\bar o}(t) \end{bmatrix} + Ju(t) \tag{A.1-17b}$$

with $\Sigma_o = (\Phi_o, G_o, H_o, J)$ completely observable. $\hat\Sigma$ is said to be obtained
from $\Sigma$ via a GK canonical observability decomposition.

• $\Sigma$ is said to be completely reconstructible, or $(\Phi, H)$ a reconstructible pair,
if every final state of $\Sigma$ can be uniquely determined by knowing the final
output and the previous input and output sequences over intervals of finite
but arbitrary length.

Theorem A.1-6. A system $\Sigma$ is completely reconstructible if and only if either
$\Sigma$ is completely observable or the matrix $\Phi_{\bar o}$ resulting from a GK canonical
observability decomposition of $\Sigma$ is nilpotent.



A.2 Stability 

• The dynamic linear system (1) is said to be exponentially stable if there exist
two positive reals $\alpha$ and $\lambda$, $\lambda < 1$, such that

$$\|\varphi(t,t_0,x(t_0),O_\Omega)\| \le \alpha\lambda^{t-t_0}\|x(t_0)\| \tag{A.2-1}$$

for all $x(t_0) \in \mathbb{R}^n$ and $t \ge t_0$. The system is asymptotically stable if

$$\lim_{t\to\infty}\varphi(t,t_0,x(t_0),O_\Omega) = O_X$$

for every $x(t_0) \in \mathbb{R}^n$. If the system is time-invariant, asymptotic stability is
equivalent to exponential stability. A square matrix $\Phi$ is said to be a stability
matrix if all its eigenvalues have modulus less than one:

$$\mathrm{sp}(\Phi) \subset \mathbb{C}_s$$

$\mathrm{sp}(\Phi)$, the spectrum of $\Phi$, being the set of the eigenvalues of $\Phi$, and $\mathbb{C}_s$ the
unit open disc in the complex plane.

Theorem A.2-1. The time-invariant dynamic linear system (1), (7) is
asymptotically stable if and only if its state-transition matrix $\Phi$ is a stability
matrix or, equivalently, the d-characteristic polynomial of $\Phi$, $\chi_\Phi(d) := \det(I - d\Phi)$,
is strictly Hurwitz.
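
For a 2 × 2 matrix the two equivalent conditions of the theorem can be verified by hand or with a few lines of code. The sketch below is restricted to the 2 × 2 case, where the eigenvalues follow from the trace and the determinant, and it assumes the convention that a polynomial in d is strictly Hurwitz when all its roots lie outside the closed unit disc. The companion matrices used are built from the autoregressive parts of the plants of Examples 9.6-3 and 9.6-5; the function names are hypothetical.

```python
import cmath


def eigenvalues_2x2(Phi):
    """Roots of lambda^2 - tr*lambda + det = 0."""
    tr = Phi[0][0] + Phi[1][1]
    det = Phi[0][0] * Phi[1][1] - Phi[0][1] * Phi[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0


def is_stability_matrix_2x2(Phi):
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(Phi))


def d_characteristic_roots_2x2(Phi):
    """Roots of det(I - d*Phi) = 1 - tr*d + det*d^2: reciprocals of nonzero eigenvalues."""
    return [1.0 / lam for lam in eigenvalues_2x2(Phi) if lam != 0]


if __name__ == "__main__":
    stable = [[0.0, 1.0], [-0.7, 1.5]]      # lambda^2 - 1.5*lambda + 0.7
    unstable = [[0.0, 1.0], [0.5, -0.9]]    # lambda^2 + 0.9*lambda - 0.5
    for Phi in (stable, unstable):
        print(is_stability_matrix_2x2(Phi),
              [abs(r) for r in d_characteristic_roots_2x2(Phi)])
```

The eigenvalues lie inside the unit disc exactly when the roots of the d-characteristic polynomial lie outside it, since the latter are the reciprocals of the nonzero eigenvalues.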

• $\Sigma$ is said to be stabilizable, or $(\Phi, G)$ a stabilizable pair, if there exist matrices
$F \in \mathbb{R}^{m \times n}$ such that $\Phi + GF$ is a stability matrix

$$\mathrm{sp}(\Phi + GF) \subset \mathbb{C}_s$$

• $\Sigma$ is said to be detectable, or $(\Phi, H)$ a detectable pair, if there exist matrices
$K \in \mathbb{R}^{n \times p}$ such that $\Phi - KH$ is a stability matrix

$$\mathrm{sp}(\Phi - KH) \subset \mathbb{C}_s$$

Theorem A.2-2. $\Sigma$ is stabilizable (detectable) if and only if either $\Sigma$ is completely
reachable (observable) or the matrix $\Phi_{\bar r}$ ($\Phi_{\bar o}$) resulting from a GK canonical
reachability (observability) decomposition of $\Sigma$ is a stability matrix.

The following stability property of slowly time varying systems is frequently used 
in the global convergence analysis of adaptive systems. 
Theorem A.2-3. [Des70b] Consider the linear dynamic system

$$x(k+1) = \Phi(k)x(k), \quad k \in \mathbb{Z}_+ \tag{A.2-2}$$

where the $\Phi(k)$ are bounded stability matrices, viz.

$$\|\Phi(k)\| < M < \infty \quad \text{and} \quad |\lambda[\Phi(k)]| < 1 - \epsilon, \quad \forall k \in \mathbb{Z}_+$$

with $\epsilon > 0$ and $\lambda[\Phi(k)]$ any eigenvalue of $\Phi(k)$. Then, provided that

$$\lim_{k\to\infty}\|\Phi(k) - \Phi(k-1)\| = 0, \tag{A.2-3}$$

the system (2) is exponentially stable.

Note that (3) does not imply convergence of $\Phi(k)$.



A.3 State-Space Realizations

Given the transfer function

$$H(z) = \frac{b_1 z^{n-1} + \cdots + b_n}{z^n + a_1 z^{n-1} + \cdots + a_n} \tag{A.3-1}$$

we can find a state-space representation $\Sigma = (\Phi, G, H)$, such that $H(z) = H(zI_n - \Phi)^{-1}G$,
in the following straightforward way

$$\Phi = \begin{bmatrix} 0 & I_{n-1} \\ -a_n & -a_{n-1} \; \cdots \; -a_1 \end{bmatrix} \qquad G = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \qquad H = [\, b_n \; \cdots \; b_1 \,] \tag{A.3-2}$$

$\Sigma$ is said to be a realization of H(z). In particular, (2) is the so-called canonical
reachable realization of (1).

It is more difficult to find a realization $\Sigma$ of the impulse response sequence
$W = \{w_k\}_{k=1}^{\infty}$, viz. a triplet $(\Phi, G, H)$ such that

$$w_k = H\Phi^{k-1}G, \quad \forall k \in \mathbb{Z}_1 \tag{A.3-3}$$

In this connection, a key point consists of considering the following Hankel matrices

$$H_N = \begin{bmatrix} w_1 & w_2 & \cdots & w_N \\ w_2 & w_3 & \cdots & w_{N+1} \\ \vdots & \vdots & & \vdots \\ w_N & w_{N+1} & \cdots & w_{2N-1} \end{bmatrix}, \quad N \in \mathbb{Z}_1 \tag{A.3-4}$$

Then, the minimal dimension of the realizations of W equals the integer N for
which $\det H_N \ne 0$ and $\det H_{N+i} = 0$, $\forall i \in \mathbb{Z}_1$. Realizations of minimal dimension
are called minimal. A realization $\Sigma$ is minimal if and only if $\Sigma$ is completely
reachable and completely observable.
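
The constructions of this section can be tried out on a small example. The sketch below builds the canonical reachable realization (A.3-2) from the transfer function coefficients, generates the impulse response samples $w_k = H\Phi^{k-1}G$, and evaluates the Hankel determinants in exact rational arithmetic. The second-order transfer function $H(z) = (z + 0.5)/(z^2 - 1.5z + 0.7)$ and all function names are assumptions made for illustration.

```python
from fractions import Fraction


def canonical_reachable(a, b):
    """a = [a_1, ..., a_n], b = [b_1, ..., b_n] -> (Phi, G, H) as in (A.3-2)."""
    n = len(a)
    Phi = [[Fraction(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    Phi.append([-Fraction(c) for c in reversed(a)])     # last row: -a_n ... -a_1
    G = [Fraction(0)] * (n - 1) + [Fraction(1)]
    H = [Fraction(c) for c in reversed(b)]              # b_n ... b_1
    return Phi, G, H


def markov_parameters(Phi, G, H, count):
    """[w_1, ..., w_count] with w_k = H Phi^(k-1) G."""
    w, v = [], list(G)
    for _ in range(count):
        w.append(sum(h * x for h, x in zip(H, v)))
        v = [sum(p * x for p, x in zip(row, v)) for row in Phi]
    return w


def hankel(w, N):
    """N x N Hankel matrix built from w = [w_1, w_2, ...] (needs 2N - 1 samples)."""
    return [[w[i + j] for j in range(N)] for i in range(N)]


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n, d = len(A), Fraction(1)
    for c in range(n):
        p = next((i for i in range(c, n) if A[i][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for i in range(c + 1, n):
            f = A[i][c] / A[c][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[c])]
    return d


if __name__ == "__main__":
    a = [Fraction(-3, 2), Fraction(7, 10)]
    b = [Fraction(1), Fraction(1, 2)]
    w = markov_parameters(*canonical_reachable(a, b), 5)
    print([float(x) for x in w])
    print(det(hankel(w, 2)), det(hankel(w, 3)))
```

For this example the 2 × 2 Hankel determinant is nonzero while the 3 × 3 one vanishes, consistent with a minimal realization of dimension two.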
APPENDIX B

SOME RESULTS OF
POLYNOMIAL MATRIX
THEORY



The purpose of this appendix is to provide a quick review of those results of polynomial
matrix theory used in this book. For more extensive treatments standard
textbooks, for example [Bar83], [BY83], [Kai80], [Ros70] and [Wol74], should
be consulted.



B.1 Matrix-Fraction Descriptions

Polynomial matrices arise naturally in linear system theory. Consider the $p \times m$ transfer matrix

$$H(z) = H(zI_n - \Phi)^{-1}G \tag{B.1-1}$$

associated with the finite-dimensional linear time-invariant dynamical system $(\Phi, G, H)$. $H(z)$ is a rational matrix in the indeterminate $z$, viz. a matrix whose elements are rational functions of $z$. Let $\ell(z)$ be the least common multiple of the denominator polynomials of the $H(z)$-entries. Then we can write

$$H(z) = \frac{N(z)}{\ell(z)} \tag{B.1-2}$$

where $N(z)$ is a $(p \times m)$ polynomial matrix.

Eq. (B.1-2) can also be rewritten as a right matrix-fraction

$$H(z) = N(z)M^{-1}(z), \qquad M(z) := \ell(z)I_m \tag{B.1-3}$$

or a left matrix-fraction

$$H(z) = M^{-1}(z)N(z), \qquad M(z) := \ell(z)I_p \tag{B.1-4}$$



Eqs. (B.1-3) and (B.1-4) are two examples of right and left matrix-fraction descriptions (MFDs) of $H(z)$. There are thus many MFDs of a given transfer matrix. We are interested in finding MFDs that are irreducible in a well-defined sense. In order to introduce the concept, we define the degree of an MFD $N(z)M^{-1}(z)$ as the degree of the determinantal polynomial of its denominator matrix

$$\text{the degree of the MFD} := \partial[\det M(z)] \tag{B.1-5}$$

Referring to (B.1-3) we find $\partial[\det M(z)] = \partial[\ell^m(z)] = m\,\partial\ell(z)$. Likewise, for the degree of the left MFD (B.1-4), we find $\partial[\det M(z)] = \partial[\ell^p(z)] = p\,\partial\ell(z)$. Given an MFD, we shall see how to obtain MFDs of minimal degree. One reason for this interest is that MFDs of minimal degree are intimately related to minimal state-space representations.



B.1.1 Divisors and Irreducible MFDs

From now on we shall only consider right MFDs. All the material can be easily 
duplicated to cover, mutatis mutandis, the case of left MFDs. 

Given a pair $(M(z), N(z))$ of polynomial matrices with equal number of columns and $M(z)$ square and nonsingular, viz. $\det M(z) \not\equiv 0$, we call $A(z)$, $\dim A(z) = \dim M(z)$, a common right divisor (crd) of $M(z)$ and $N(z)$ if there exist polynomial matrices $\bar M(z)$ and $\bar N(z)$ such that

$$M(z) = \bar M(z)A(z) \quad \text{and} \quad N(z) = \bar N(z)A(z) \tag{B.1-6}$$

Since

$$\partial[\det M(z)] = \partial[\det \bar M(z)] + \partial[\det A(z)] \tag{B.1-7}$$

it follows that

$$\partial[\det M(z)] \ge \partial[\det \bar M(z)] \tag{B.1-8}$$

Further, $N(z)M^{-1}(z) = \bar N(z)\bar M^{-1}(z)$. Therefore, the degree of an MFD can be reduced by removing right divisors of the numerator and denominator matrices.

A square polynomial matrix $A(z)$ is called unimodular if its determinant is a nonzero constant, independent of $z$. For instance

$$A(z) = \begin{bmatrix} z+1 & z \\ z & z-1 \end{bmatrix}$$

is unimodular since $\det A(z) = -1$. A polynomial matrix $A(z)$ is unimodular if and only if its inverse $A^{-1}(z)$ is polynomial.
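The unimodular example can be checked mechanically. In the following minimal sketch, polynomials are stored as lists of coefficients in ascending powers of $z$ (a representation chosen here for illustration), the determinant of the $2 \times 2$ example is computed, and the candidate inverse obtained from the adjugate divided by $\det A(z) = -1$ is verified to be polynomial.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def det2(A):
    """Determinant of a 2x2 polynomial matrix."""
    return padd(pmul(A[0][0], A[1][1]), [-c for c in pmul(A[0][1], A[1][0])])

def matmul2(A, B):
    """Product of two 2x2 polynomial matrices."""
    return [[padd(pmul(A[i][0], B[0][j]), pmul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

A = [[[1, 1], [0, 1]],          # [ z+1    z  ]
     [[0, 1], [-1, 1]]]         # [  z    z-1 ]
# adj(A)/det(A) with det(A) = -1, hence a polynomial matrix:
A_inv = [[[1, -1], [0, 1]],     # [ 1-z    z  ]
         [[0, 1], [-1, -1]]]    # [  z   -z-1 ]

print(det2(A), matmul2(A, A_inv))
```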

We see from (B.1-7) that equality holds in (B.1-8) if and only if $A(z)$ is unimodular.

We say that $A(z)$ is a greatest common right divisor (gcrd) of $M(z)$ and $N(z)$ if for any crd $\bar A(z)$ of $M(z)$ and $N(z)$, there exists a polynomial matrix $X(z)$ such that

$$A(z) = X(z)\bar A(z)$$

Let $A(z)$ and $\bar A(z)$ be two gcrd's of $M(z)$ and $N(z)$. Then, for some polynomial matrices $X(z)$ and $Y(z)$

$$\left.\begin{array}{l} A(z) = X(z)\bar A(z) \\ \bar A(z) = Y(z)A(z) \end{array}\right\} \;\Longrightarrow\; A(z) = X(z)Y(z)A(z)$$

Hence, $X(z) = Y^{-1}(z)$, so that both $X(z)$ and $Y(z)$ are unimodular. It follows that:

i. All gcrd's can only differ by a unimodular (left) factor;

ii. If a gcrd is unimodular, then all gcrd's must be unimodular;

iii. All gcrd's have determinant polynomials of the same degree.

$M(z)$ and $N(z)$ as in (B.1-6) are relatively right prime or right coprime if their gcrd's are unimodular. In such a case, the right MFD $N(z)M^{-1}(z)$ is said to be irreducible since it has the minimum possible degree.

B.1.2 Elementary Row (Column) Operations for Polynomial 
Matrices 

i. Interchange of any two rows (columns); 

ii. Addition to any row (column) of a polynomial multiple of any other row 
(column) ; 

iii. Scaling any row (column) by any nonzero real number. 

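These operations can be represented by elementary matrices, premultiplication (postmultiplication) by which corresponds to elementary row (column) operations. Some examples are:

$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} A = \begin{bmatrix} A_{2\cdot} \\ A_{1\cdot} \\ A_{3\cdot} \end{bmatrix}, \qquad \begin{bmatrix} 1 & a(z) & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} A = \begin{bmatrix} A_{1\cdot} + a(z)A_{2\cdot} \\ A_{2\cdot} \\ A_{3\cdot} \end{bmatrix}$$

where $A_{i\cdot}$ denotes the $i$th row of $A$ and $a(z)$ a polynomial.

All the above elementary matrices are unimodular. Conversely, premultiplication (postmultiplication) by a unimodular matrix corresponds to the action of a sequence of elementary row (column) operations.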



B.1.3 A Construction for a gcrd

Given $m \times m$ and $p \times m$ polynomial matrices $M(z)$ and $N(z)$, form the matrix $[\,M'(z)\;\;N'(z)\,]'$. Next, find elementary row operations (or equivalently a unimodular matrix $U(z)$) such that $p$ of the bottom rows of the matrix on the RHS of the following equation are identically zero

$$U(z)\begin{bmatrix} M(z) \\ N(z) \end{bmatrix} = \begin{bmatrix} A(z) \\ 0 \end{bmatrix} \begin{matrix} \}\,m \\ \}\,p \end{matrix} \tag{B.1-9}$$

Then, the square matrix denoted $A(z)$ in (B.1-9) is a gcrd of $M(z)$ and $N(z)$. In particular, to this end we can use the procedure to construct the Hermite form [Kai80].

B.1.4 Bezout Identity

$M(z)$ and $N(z)$ are right coprime if and only if there exist polynomial matrices $X(z)$ and $Y(z)$ such that

$$X(z)M(z) + Y(z)N(z) = I_m \tag{B.1-10}$$
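In the scalar case $m = p = 1$ the gcrd construction reduces to Euclid's algorithm, each division step being an elementary row operation on the stacked pair, and the extended version of the algorithm delivers the polynomials $X(z)$ and $Y(z)$ of the Bezout identity. The sketch below covers only this scalar case; the helper names are hypothetical and polynomials are lists of coefficients in ascending powers of $z$. It is not a substitute for the Hermite-form procedure needed for matrices.

```python
from fractions import Fraction

def trim(p):
    p = [Fraction(c) for c in p]
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def neg(p):
    return [-c for c in p]

def divide(p, q):
    """(Q, R) with p = q*Q + R and deg R < deg q; q must be nonzero."""
    quo, rem, q = [Fraction(0)], trim(p), trim(q)
    while rem != [0] and len(rem) >= len(q):
        term = [Fraction(0)] * (len(rem) - len(q)) + [rem[-1] / q[-1]]
        quo, rem = add(quo, term), add(rem, neg(mul(term, q)))
    return quo, rem

def ext_gcd(m, n):
    """(g, x, y) with x*m + y*n = g, where g is the monic gcd of m and n."""
    r0, r1, x0, x1, y0, y1 = trim(m), trim(n), [1], [0], [0], [1]
    while r1 != [0]:
        q, r = divide(r0, r1)
        r0, r1 = r1, r
        x0, x1 = x1, add(x0, neg(mul(q, x1)))
        y0, y1 = y1, add(y0, neg(mul(q, y1)))
    scale = [1 / r0[-1]]                 # normalize to a monic gcd
    return mul(r0, scale), mul(x0, scale), mul(y0, scale)

M = [-1, 0, 1]        # z^2 - 1 = (z - 1)(z + 1)
N = [2, 1]            # z + 2, coprime with M
N2 = [2, 3, 1]        # z^2 + 3z + 2 = (z + 1)(z + 2), shares z + 1 with M

g, x, y = ext_gcd(M, N)
g2, x2, y2 = ext_gcd(M, N2)
print(g, g2)
```

For the coprime pair the returned $g$ is the constant 1, so `x` and `y` play the role of $X(z)$ and $Y(z)$ in the Bezout identity; for the second pair the common factor $z + 1$ is recovered.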

B.2 Column- and Row-Reduced Matrices

A rational transfer matrix $H(z)$ is said to be proper if $\lim_{z\to\infty} H(z)$ is finite, and strictly proper if $\lim_{z\to\infty} H(z) = 0$. Properness or strict properness of a transfer matrix can be verified by inspection. If we refer to an MFD $N(z)M^{-1}(z)$ the situation is not so simple. We need the following definition

the degree of a polynomial vector := the highest degree of all the entries of the vector.

Let

$$k_j := \partial M_j(z) : \text{the degree of the } j\text{-th column of } M(z) \tag{B.2-1}$$

Then, clearly

$$\partial[\det M(z)] \le \sum_{j=1}^{m} k_j \tag{B.2-2}$$

If equality holds in (B.2-2), we shall say that $M(z)$ is column-reduced. In general, we can always write

$$M(z) = M_{hc}S(z) + L(z) \tag{B.2-3}$$

where

$$S(z) := \operatorname{diag}\{z^{k_j},\; j = 1, 2, \cdots, m\}$$

$$M_{hc} := \text{the highest-column-degree coefficient matrix of } M(z)$$

a matrix whose $j$-th column comprises the coefficients of $z^{k_j}$ in the $j$-th column of $M(z)$. Finally, $L(z)$ denotes the remaining terms and is a polynomial matrix with column degrees strictly less than those of $M(z)$. E.g.

$$M(z) = \begin{bmatrix} z^3+1 & z^2+2 \\ z^2+z & 1 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}}_{M_{hc}} \underbrace{\begin{bmatrix} z^3 & 0 \\ 0 & z^2 \end{bmatrix}}_{S(z)} + \underbrace{\begin{bmatrix} 1 & 2 \\ z^2+z & 1 \end{bmatrix}}_{L(z)}$$

Then

$$\det M(z) = (\det M_{hc})\, z^{\sum_j k_j} + \text{terms of lower degree in } z \tag{B.2-4}$$

Therefore, it follows that

A nonsingular polynomial matrix is column-reduced if and only if its highest-column-degree coefficient matrix is nonsingular.

Properness of $N(z)M^{-1}(z)$ can be established provided that $M(z)$ is column (row)-reduced.

If $M(z)$ is column-reduced, then $N(z)M^{-1}(z)$ is strictly proper (proper) if and only if each column of $N(z)$ has degree less than (less than or equal to) the degree of the corresponding column of $M(z)$.

Any nonsingular polynomial matrix can be made column (row)-reduced by using elementary column (row) operations to successively reduce the individual column (row) degrees until column (row)-reducedness is achieved. E.g., taking $M(z)$ as in the example preceding (B.2-4),

$$\bar M(z) = M(z)\begin{bmatrix} 1 & 0 \\ -z & 1 \end{bmatrix} = \begin{bmatrix} -2z+1 & z^2+2 \\ z^2 & 1 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}}_{\bar M_{hc}} \underbrace{\begin{bmatrix} z^2 & 0 \\ 0 & z^2 \end{bmatrix}}_{\bar S(z)} + \underbrace{\begin{bmatrix} -2z+1 & 2 \\ 0 & 1 \end{bmatrix}}_{\bar L(z)}$$

and we see that $\bar M(z)$ is column-reduced. Hence, given a nonsingular $M(z)$, there exist unimodular matrices $W(z)$ such that $\bar M(z) = M(z)W(z)$ is column-reduced. Therefore, any right MFD $N(z)M^{-1}(z)$ of $H(z)$ can be transformed into a right MFD with column-reduced denominator matrix. In fact, $H(z) = N(z)W(z)[M(z)W(z)]^{-1} = \bar N(z)\bar M^{-1}(z)$ with $\bar N(z) := N(z)W(z)$ and $\bar M(z) := M(z)W(z)$.
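The column-reducedness test lends itself to a direct computation. The following sketch takes $M(z)$ with rows $[z^3+1,\; z^2+2]$ and $[z^2+z,\; 1]$ and the unimodular $W(z)$ with rows $[1,\; 0]$ and $[-z,\; 1]$, as read from the example, extracts the highest-column-degree coefficient matrix, and checks its determinant before and after the column operation. Function names are illustrative and only the $2 \times 2$ case is handled.

```python
def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def degree(p):
    """Degree of a polynomial in ascending coefficients (zero polynomial: -1)."""
    p = trim(p)
    return -1 if p == [0] else len(p) - 1

def column_degrees(M):
    return [max(degree(M[i][j]) for i in range(2)) for j in range(2)]

def highest_column_degree_matrix(M):
    k = column_degrees(M)
    return [[(M[i][j][k[j]] if len(M[i][j]) > k[j] else 0) for j in range(2)]
            for i in range(2)]

def is_column_reduced(M):
    h = highest_column_degree_matrix(M)
    return h[0][0] * h[1][1] - h[0][1] * h[1][0] != 0

def matmul2(A, B):
    return [[padd(pmul(A[i][0], B[0][j]), pmul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

M = [[[1, 0, 0, 1], [2, 0, 1]],     # [ z^3+1   z^2+2 ]
     [[0, 1, 1], [1]]]              # [ z^2+z     1   ]
W = [[[1], [0]],                    # [  1   0 ]
     [[0, -1], [1]]]                # [ -z   1 ]
M_bar = matmul2(M, W)

print(column_degrees(M), is_column_reduced(M))
print(column_degrees(M_bar), is_column_reduced(M_bar))
```

The sum of the column degrees drops from 5 to 4, which is the degree of $\det M(z)$, so that equality holds in the column-degree bound after the operation.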



B.3 Reachable Realizations from Right MFDs

W.l.o.g. we assume that the right MFD $H(z) = N(z)M^{-1}(z)$ has column-reduced denominator matrix. It is also assumed that $H(z)$ is strictly proper. We note that a system having transfer matrix $H(z)$ can be represented in terms of a system of difference equations as follows

$$y(t) = N(z)M^{-1}(z)u(t) \iff \begin{cases} M(z)\xi(t) = u(t) \\ y(t) = N(z)\xi(t) \end{cases} \tag{B.3-1}$$

where: $z$ is now to be interpreted as the forward-shift operator $zy(t) := y(t+1)$; and $\xi(t) \in \mathbb{R}^m$ is called the partial state. Let $\xi_i(t)$, $i = 1, \cdots, m$, be the $i$-th component of $\xi(t)$. Define:

$$x(t) := [\,\xi_1(t) \cdots \xi_1(t+k_1-1) \;\cdots\cdots\; \xi_m(t) \cdots \xi_m(t+k_m-1)\,]' \in \mathbb{R}^{\sum_i k_i} \tag{B.3-2}$$

Then, a state-space realization $(\Phi, G, H)$ with state $x(t)$ of dimension (Cf. (B.2-4))

$$\dim\Phi = \sum_i k_i = \partial\det M(z) \tag{B.3-3}$$

can be constructed (Cf. [Kai80]) with the following properties:

i. $(\Phi, G)$ is completely reachable;

ii. $(\Phi, H)$ is completely observable if and only if $N(z)$ and $M(z)$ are right coprime;

iii. $\chi_\Phi(z) := \det(zI - \Phi) = (\det M_{hc})^{-1}\det M(z)$. (B.3-4)

B.4 Relationship between z and d MFDs

Let

$$H(z) = N(z)M^{-1}(z) = H(zI - \Phi)^{-1}G$$

with $M(z) = M_{hc}S(z) + L(z)$ column-reduced and such that

$$n := \dim\Phi = \partial\det M(z)$$

and

$$\chi_\Phi(z) := \det(zI - \Phi) = (\det M_{hc})^{-1}\det M(z)$$

We have

$$M(z)S(z^{-1})M_{hc}^{-1} = I_m + L(z)S(z^{-1})M_{hc}^{-1} =: M(d)\big|_{d=z^{-1}} \tag{B.4-1}$$

Similarly,

$$N(z)S(z^{-1})M_{hc}^{-1} =: N(d)\big|_{d=z^{-1}} \tag{B.4-2}$$

In the above equations we have used the fact that $S(z^{-1}) = S^{-1}(z)$. We can see that, $M(z)$ being column-reduced, $M(d)$ and $N(d)$ are polynomial matrices in the indeterminate $d$. Further,

$$N(d)M^{-1}(d) = H(d^{-1}) = H(d^{-1}I_n - \Phi)^{-1}G = H(I_n - d\Phi)^{-1}dG \tag{B.4-3}$$

Then, $N(d)M^{-1}(d)$ is a right MFD of the d-transfer matrix $H(I_n - d\Phi)^{-1}dG$ associated with $(\Phi, G, H)$ [Cf. (3.1-28)]. Further, we find

$$\begin{aligned} \det M(d) &= (\det M_{hc})^{-1}\det M(d^{-1})\det S(d) && [\text{(B.4-1)}] \\ &= d^{\sum_i k_i}\det(d^{-1}I_n - \Phi) && [\text{(B.3-4)}] \\ &= \det(I_n - d\Phi) =: \chi_\Phi(d) \end{aligned} \tag{B.4-4}$$

Finally, if $M(z)$ and $N(z)$ are right coprime, $M(d)$ and $N(d)$ turn out to be such. Then, if $H(d) = H(I_n - d\Phi)^{-1}dG$ has an irreducible right MFD $N(d)M^{-1}(d)$, Fact 3.1-1 follows.

B.5 Divisors and System-Theoretic Properties

PBH rank tests [Kai80, p. 136]

i. A pair $(\Phi, G)$ is reachable if and only if

$$\operatorname{rank}\,[\,zI_n - \Phi \;\; G\,] = n \quad \text{for all } z \in \mathbb{C} \tag{B.5-1}$$

ii. A pair $(H, \Phi)$ is observable if and only if

$$\operatorname{rank}\begin{bmatrix} H \\ zI_n - \Phi \end{bmatrix} = n \quad \text{for all } z \in \mathbb{C} \tag{B.5-2}$$

Setting $d := z^{-1}$,

$$[\,zI_n - \Phi \;\; G\,] = z\,[\,I_n - d\Phi \;\; dG\,]$$

Then, we have the following results:

iii. $$\operatorname{rank}\,[\,I_n - d\Phi \;\; dG\,] = n \quad \text{for all } d \in \mathbb{C} \tag{B.5-3}$$
if and only if the pair $(\Phi, G)$ is controllable, i.e. it has no nonzero unreachable eigenvalue.

It is to be pointed out that (B.5-3) is equivalent to left coprimeness of the polynomial matrices $A(d) := I_n - d\Phi$ and $B(d) := dG$.

iv. $$\operatorname{rank}\begin{bmatrix} I_n - d\Phi \\ H \end{bmatrix} = n \quad \text{for all } d \in \mathbb{C} \tag{B.5-4}$$
if and only if the pair $(H, \Phi)$ is reconstructible, i.e. it has no nonzero unobservable eigenvalue.

It is to be pointed out that (B.5-4) is equivalent to right coprimeness of the polynomial matrices $I_n - d\Phi$ and $H$.

The following properties, which can be easily verified via GK canonical decompositions, relate reducible MFDs to system-theoretic attributes of the triplet $(\Phi, G, H)$.

v. The GCLDs of $I_n - d\Phi$ and $dG$ are strictly Hurwitz if and only if the pair $(\Phi, G)$ is stabilizable.

vi. The GCRDs of $I_n - d\Phi$ and $H$ are strictly Hurwitz if and only if the pair $(H, \Phi)$ is detectable.
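The difference between the $z$ and $d$ versions of the PBH test shows up as soon as $\Phi$ has a zero eigenvalue. The sketch below uses an illustrative diagonal pair, for which the eigenvalues can be read off directly, and evaluates the two rank conditions with exact rational arithmetic. It relies on the fact that $[\,I_n - d\Phi \;\; dG\,]$ can lose rank only where $I_n - d\Phi$ is singular, i.e. at $d = 1/\lambda$ with $\lambda$ a nonzero eigenvalue of $\Phi$; the function names are hypothetical.

```python
from fractions import Fraction

def rank(mat):
    """Rank by exact Gauss-Jordan elimination."""
    a = [[Fraction(x) for x in row] for row in mat]
    r = 0
    for c in range(len(a[0])):
        piv = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                f = a[i][c] / a[r][c]
                a[i] = [x - f * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == len(a):
            break
    return r

def pbh_z(Phi, G, z):
    """[zI - Phi, G] evaluated at the number z."""
    n = len(Phi)
    return [[(z if i == j else 0) - Phi[i][j] for j in range(n)] + list(G[i])
            for i in range(n)]

def pbh_d(Phi, G, d):
    """[I - d*Phi, d*G] evaluated at the number d."""
    n = len(Phi)
    return [[(1 if i == j else 0) - d * Phi[i][j] for j in range(n)]
            + [d * g for g in G[i]] for i in range(n)]

# Illustrative pair: eigenvalues 0 and 1/2; the input acts on the second state only.
Phi = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(1, 2)]]
G = [[Fraction(0)], [Fraction(1)]]
eigs = [Fraction(0), Fraction(1, 2)]     # Phi is diagonal

reachable = all(rank(pbh_z(Phi, G, z)) == 2 for z in eigs)
controllable = all(rank(pbh_d(Phi, G, 1 / lam)) == 2 for lam in eigs if lam != 0)
print(reachable, controllable)
```

Here the zero eigenvalue is unreachable, so the $z$-test fails, while the $d$-test is passed because the only unreachable eigenvalue is zero.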
APPENDIX C

SOME RESULTS ON LINEAR
DIOPHANTINE EQUATIONS

The purpose of this appendix is to provide a quick review of those results on linear Diophantine equations used in this book. For a more extensive treatment, the monograph [Kuc79] should be consulted. Diophantus of Alexandria studied in the third century A.D. the problem, isomorphic to the one of equation (C.1-1) below, of finding integers $(x, y)$ solving the equation $ax + by = c$ with $a$, $b$ and $c$ given integers.



C.1 Unilateral Polynomial Matrix Equations

Let $R_{pm}[d]$ denote the set of $p \times m$ matrices whose entries are polynomials in the indeterminate $d$. We consider either the equation

$$A(d)X(d) + B(d)Y(d) = C(d) \tag{C.1-1}$$

or the equation

$$X(d)A(d) + Y(d)B(d) = C(d) \tag{C.1-2}$$

In (C.1-1), $A(d) \in R_{pp}[d]$ and nonsingular, and $C(d) \in R_{pp}[d]$. In (C.1-2), $A(d) \in R_{mm}[d]$ and nonsingular, and $C(d) \in R_{mm}[d]$. In both (C.1-1) and (C.1-2), $B(d) \in R_{pm}[d]$, and $X(d)$ and $Y(d)$ are polynomial matrices of compatible dimensions. By a solution we mean any pair of polynomial matrices $X(d)$ and $Y(d)$ satisfying either (C.1-1) or (C.1-2).

Result C.1-1. Equation (C.1-1) is solvable if and only if the GCLDs of $A(d)$ and $B(d)$ are left divisors of $C(d)$. Provided that (C.1-1) is solvable, the general solution of (C.1-1) is given by

$$X(d) = X_0(d) - B_2(d)P(d) \tag{C.1-3}$$

$$Y(d) = Y_0(d) + A_2(d)P(d) \tag{C.1-4}$$

where: $(X_0(d), Y_0(d))$ is a particular solution of (C.1-1); $A_2(d)$ and $B_2(d)$ are right coprime and such that

$$A^{-1}(d)B(d) = B_2(d)A_2^{-1}(d) \tag{C.1-5}$$

and $P(d) \in R_{mp}[d]$ is an arbitrary polynomial matrix.

Result C.1-2. Equation (C.1-2) is solvable if and only if the GCRDs of $A(d)$ and $B(d)$ are right divisors of $C(d)$. Provided that (C.1-2) is solvable, the general solution of (C.1-2) is given by

$$X(d) = X_0(d) - P(d)B_1(d) \tag{C.1-6}$$

$$Y(d) = Y_0(d) + P(d)A_1(d) \tag{C.1-7}$$

where: $(X_0(d), Y_0(d))$ is a particular solution of (C.1-2); $A_1(d)$ and $B_1(d)$ are left coprime and such that

$$B(d)A^{-1}(d) = A_1^{-1}(d)B_1(d) \tag{C.1-8}$$

and $P(d) \in R_{mp}[d]$ is an arbitrary polynomial matrix.

In applications, we are usually interested in solutions of either (C.1-1) or (C.1-2) with some specific properties. In particular, we consider the minimum-degree solution of (C.1-1) w.r.t. $Y(d)$. By this, we mean, whenever it exists and is unique, the pair $(X(d), Y(d))$ solving (C.1-1) with minimum $\partial Y(d)$. Here $\partial Y(d)$ denotes the degree of $Y(d)$

$$Y(d) = Y_0 + Y_1 d + \cdots + Y_{\partial Y} d^{\partial Y}, \qquad Y_{\partial Y} \neq 0$$

with $Y_0, Y_1, \cdots, Y_{\partial Y}$ constant matrices. We say that a square polynomial matrix

$$Q(d) = Q_0 + Q_1 d + \cdots + Q_{\partial Q} d^{\partial Q}$$

is regular if its leading matrix coefficient $Q_{\partial Q}$ is nonsingular.

Result C.1-3. Let (C.1-1) be solvable and $A_2(d)$ regular. Then, the minimum-degree solution of (C.1-1) w.r.t. $Y(d)$ exists, is unique, and can be found as follows. Use the left division algorithm to divide $A_2(d)$ into $Y_0(d)$:

$$Y_0(d) = A_2(d)Q(d) + \Gamma(d), \qquad \partial\Gamma(d) < \partial A_2(d) \tag{C.1-9}$$

Then, (C.1-4) becomes

$$Y(d) = A_2(d)[Q(d) + P(d)] + \Gamma(d)$$

Hence, the required minimum-degree solution is obtained by setting

$$P(d) = -Q(d) \tag{C.1-10}$$

to get

$$X(d) = X_0(d) + B_2(d)Q(d) \tag{C.1-11}$$

$$Y(d) = \Gamma(d) \tag{C.1-12}$$
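For scalar polynomials, where all factors commute, the steps of Result C.1-3 can be carried out explicitly. The following sketch is a hedged illustration under that scalar assumption: a particular solution is obtained from the extended Euclidean algorithm, $A_2$ and $B_2$ are taken as $A$ and $B$ divided by their gcd, and the division (C.1-9) is then applied. Helper names are hypothetical, polynomials are coefficient lists in ascending powers of $d$, and the matrix case would require a polynomial-matrix division instead.

```python
from fractions import Fraction

def trim(p):
    p = [Fraction(c) for c in p]
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def neg(p):
    return [-c for c in p]

def divide(p, q):
    """(Q, R) with p = q*Q + R and deg R < deg q; q must be nonzero."""
    quo, rem, q = [Fraction(0)], trim(p), trim(q)
    while rem != [0] and len(rem) >= len(q):
        term = [Fraction(0)] * (len(rem) - len(q)) + [rem[-1] / q[-1]]
        quo, rem = add(quo, term), add(rem, neg(mul(term, q)))
    return quo, rem

def ext_gcd(a, b):
    """(g, x, y) with a*x + b*y = g, g a gcd of a and b."""
    r0, r1, x0, x1, y0, y1 = trim(a), trim(b), [1], [0], [0], [1]
    while r1 != [0]:
        q, r = divide(r0, r1)
        r0, r1 = r1, r
        x0, x1 = x1, add(x0, neg(mul(q, x1)))
        y0, y1 = y1, add(y0, neg(mul(q, y1)))
    return r0, x0, y0

def min_degree_solution(A, B, C):
    """Scalar A*X + B*Y = C with deg Y minimal."""
    g, x, y = ext_gcd(A, B)
    ratio, rest = divide(C, g)
    if rest != [0]:
        raise ValueError("gcd(A, B) does not divide C: no polynomial solution")
    X0, Y0 = mul(x, ratio), mul(y, ratio)         # a particular solution
    A2, B2 = divide(A, g)[0], divide(B, g)[0]     # coprime, A*B2 = B*A2
    Q, Gamma = divide(Y0, A2)                     # division step (C.1-9)
    return add(X0, mul(B2, Q)), Gamma             # (C.1-11), (C.1-12)

A = [1, 3, 2]          # 1 + 3d + 2d^2
B = [0, 1]             # d
C = [1, 0, 0, 1]       # 1 + d^3
X, Y = min_degree_solution(A, B, C)
print(X, Y)
```

For these illustrative data the returned $Y(d)$ has degree one, below $\partial A_2 = 2$, and substituting back reproduces $C(d)$.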

Result C.1-4. Let (C.1-2) be solvable and $A_1(d)$ regular. Then, the minimum-degree solution of (C.1-2) w.r.t. $Y(d)$ exists, is unique, and can be found as follows. Use the right division algorithm to divide $A_1(d)$ into $Y_0(d)$:

$$Y_0(d) = Q(d)A_1(d) + \Gamma(d), \qquad \partial\Gamma(d) < \partial A_1(d) \tag{C.1-13}$$

Then, (C.1-7) becomes

$$Y(d) = [Q(d) + P(d)]A_1(d) + \Gamma(d)$$

Hence, the minimum-degree solution of (C.1-2) w.r.t. $Y(d)$ is given by

$$X(d) = X_0(d) + Q(d)B_1(d) \tag{C.1-14}$$

$$Y(d) = \Gamma(d) \tag{C.1-15}$$

C.2 Bilateral Polynomial Matrix Equations

In this book, we shall find various bilateral polynomial matrix equations of the form

$$E(d)X(d) + Z(d)G(d) = C(d) \tag{C.2-1}$$

where $E(d) \in R_{mm}[d]$, $G(d) \in R_{nn}[d]$, $C(d) \in R_{mn}[d]$, and $X(d)$ and $Z(d)$ are unknown polynomial matrices of compatible dimensions. Solvability conditions for (C.2-1) are more complicated than the ones for (C.1-1) and (C.1-2). However, we shall only encounter (C.2-1) in the special case where $G(d)$ is strictly Hurwitz and $E(d)$ anti-Hurwitz. This implies that

$$\det E(d) \text{ and } \det G(d) \text{ are coprime polynomials} \tag{C.2-2}$$

Result C.2-1. Provided that (C.2-2) is fulfilled, (C.2-1) is solvable. Further, the general solution of (C.2-1) is given by

$$X(d) = X_0(d) + L(d)G(d) \tag{C.2-3}$$

$$Z(d) = Z_0(d) - E(d)L(d) \tag{C.2-4}$$

where $(X_0(d), Z_0(d))$ is a particular solution of (C.2-1) and $L(d) \in R_{mn}[d]$ is an arbitrary polynomial matrix.

Result C.2-2. Let (C.2-1) be solvable and $E(d)$ regular. Then, the minimum-degree solution of (C.2-1) w.r.t. $Z(d)$ exists, is unique, and can be found as follows. Use the left division algorithm to divide $E(d)$ into $Z_0(d)$:

$$Z_0(d) = E(d)Q(d) + \Gamma(d), \qquad \partial\Gamma(d) < \partial E(d) \tag{C.2-5}$$

Then, (C.2-4) becomes

$$Z(d) = E(d)[Q(d) - L(d)] + \Gamma(d)$$

Hence, the desired minimum-degree solution is obtained by setting

$$L(d) = Q(d) \tag{C.2-6}$$

to get

$$X(d) = X_0(d) + Q(d)G(d) \tag{C.2-7}$$

$$Z(d) = \Gamma(d) \tag{C.2-8}$$
APPENDIX D

PROBABILITY THEORY AND
STOCHASTIC PROCESSES



The purpose of this appendix is to provide a quick review of those concepts and results from probability and the theory of stochastic processes used in this book. For more extensive treatments, standard textbooks, for example [Cra46], [Doo53], [Loe63], [Chu68] and [Nev75], should be consulted.

D.1 Probability Space

A probability space is a triple $(\Omega, \mathcal{F}, P)$ where: $\Omega$, the sample space, is a nonempty set of elements $\omega$; $\mathcal{F}$ is a $\sigma$-field (or a $\sigma$-algebra) of subsets of $\Omega$, viz. a collection of subsets containing the empty set and closed under complements and countable unions; $P$ is a probability measure, viz. a function $P : \mathcal{F} \to \mathbb{R}$ satisfying the following axioms

$$P(A) \ge 0, \;\; \forall A \in \mathcal{F}; \qquad P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i) \;\text{ for pairwise disjoint } A_i \in \mathcal{F}; \qquad P(\Omega) = 1$$

Any element of $\mathcal{F}$ is called an event; in particular, $\Omega$ and $\emptyset$ are sometimes referred to as the sure and, respectively, the impossible event.

Given a family $S$ of subsets of $\Omega$, there is a uniquely determined $\sigma$-field, denoted $\sigma(S)$, on $\Omega$ which is the smallest $\sigma$-field containing $S$. $\sigma(S)$ is called the $\sigma$-field generated by $S$.

D.2 Random Variables

Let $(\Omega, \mathcal{F}, P)$ be a probability space. A real-valued function $v(\omega)$ on $\Omega$, $v : \Omega \to \mathbb{R}$, is called a random variable if it is measurable w.r.t. $\mathcal{F}$, viz. the set $\{\omega \mid v(\omega) \in R\}$ belongs to $\mathcal{F}$ for every open set $R \subset \mathbb{R}$.

Let $v$ be a random variable such that $\int_\Omega |v(\omega)|\,dP(\omega) < \infty$. Then, its expected value or mean is defined as

$$\mathcal{E}\{v\} := \int_\Omega v(\omega)\,dP(\omega)$$

The mean of $v^k(\omega)$ is called the $k$-th moment of $v(\omega)$. From the Cauchy-Schwarz inequality [Lue69] it follows that the existence of the second moment $\mathcal{E}\{v^2\}$ of $v(\omega)$ implies that its mean $\mathcal{E}\{v\}$ does exist. The quantity

$$\operatorname{Var}(v) := \mathcal{E}\{v^2\} - [\mathcal{E}\{v\}]^2$$

is called the variance of $v$. Whenever this, or equivalently $\mathcal{E}\{v^2\}$, exists, $v(\omega)$ is said to be square-integrable or to have finite variance.

Consider the set of all real-valued square-integrable random variables on $(\Omega, \mathcal{F}, P)$. This set can be made a vector space over the real field under the usual operations of pointwise sum of functions and multiplication of functions by real numbers. Given two square-integrable random variables $u$ and $v$, set $\langle u, v\rangle := \mathcal{E}\{uv\}$ and $\|u\| := +\sqrt{\langle u, u\rangle}$. Let now $u$ denote the equivalence class of random variables on $(\Omega, \mathcal{F}, P)$, where $v$ is equivalent to $u$ if $\|u - v\|^2 = \mathcal{E}\{(u - v)^2\} = 0$, i.e., $u$ denotes the collection of random variables that are identical to one another except on a set of zero probability measure. With such a stipulation, $\langle\cdot,\cdot\rangle$ satisfies all the axioms of an inner product. The above vector space of (equivalence classes of) square-integrable random variables equipped with the inner product $\langle\cdot,\cdot\rangle$ is denoted by $L_2(\Omega, \mathcal{F}, P)$. It is an important result in analysis [Roy68] that $L_2(\Omega, \mathcal{F}, P)$ is a Hilbert space.

Let $v : \Omega \to \mathbb{R}^n$ be a random vector with $n$ finite variance components. Then, $v$ is called a finite variance random vector, and its mean $\bar v := \mathcal{E}\{v\}$ and covariance matrix

$$\operatorname{Cov}(v) := \mathcal{E}\{(v - \bar v)(v - \bar v)'\}$$

are well defined. Further, if $V = \operatorname{Cov}(v)$, we have $V = V' \ge 0$. Setting $v = \bar v + \tilde v$, with $\bar v := \mathcal{E}\{v\}$ and $\tilde v := v - \bar v$, and using the fact that $\mathcal{E}\{\tilde v\} = O_n$, the following lemma is easily proved.

Lemma D.2-1. Let $v : \Omega \to \mathbb{R}^n$ be a finite variance random vector. Let $G$ be an $n \times n$ matrix. Then

$$\mathcal{E}\{v'Gv\} = \bar v'G\bar v + \operatorname{Tr}[G\operatorname{Cov}(v)]. \tag{D.2-1}$$
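The identity of Lemma D.2-1 can be checked exactly on a random vector taking finitely many values. The distribution and the matrix $G$ in the sketch below are arbitrary illustrative choices; expectations are finite sums computed with rational arithmetic.

```python
from fractions import Fraction as F

# Illustrative finite distribution: (probability, outcome v) pairs on R^2.
dist = [(F(1, 2), (F(1), F(0))),
        (F(1, 3), (F(0), F(2))),
        (F(1, 6), (F(-1), F(1)))]
G = [[F(2), F(1)], [F(0), F(3)]]
n = 2

def expect(f):
    """E{f(v)} as a finite weighted sum."""
    return sum(p * f(v) for p, v in dist)

def quad(x, M, y):
    """x' M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(n) for j in range(n))

mean = [expect(lambda v, i=i: v[i]) for i in range(n)]
cov = [[expect(lambda v, i=i, j=j: (v[i] - mean[i]) * (v[j] - mean[j]))
        for j in range(n)] for i in range(n)]

lhs = expect(lambda v: quad(v, G, v))                                      # E{v'Gv}
trace_term = sum(G[i][j] * cov[j][i] for i in range(n) for j in range(n))  # Tr[G Cov(v)]
rhs = quad(mean, G, mean) + trace_term
print(lhs, rhs)
```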

The probability distribution (function) of $v$ is a function $P_v : \mathbb{R} \to [0, 1]$ defined as follows

$$P_v(\alpha) := P(\{\omega \mid v(\omega) \le \alpha\}), \qquad \alpha \in \mathbb{R}$$

$P_v$ is clearly nondecreasing, continuous from the right, and

$$\lim_{\alpha\to-\infty} P_v(\alpha) = 0, \qquad \lim_{\alpha\to\infty} P_v(\alpha) = 1.$$

If $P_v(\alpha)$ is absolutely continuous w.r.t. the Lebesgue measure [Roy68], then there exists a function $p_v : \mathbb{R} \to \mathbb{R}_+$, called the probability density (function) of $v$, such that

$$P_v(\alpha) = \int_{-\infty}^{\alpha} p_v(\beta)\,d\beta$$

$v(\omega) := [\,v_1(\omega) \cdots v_n(\omega)\,]'$ is an $n$-dimensional random vector if its $n$ components $v_i(\omega)$, $i = 1, \cdots, n$, are random variables. In such a case the probability distribution is a function $P_v : \mathbb{R}^n \to [0, 1]$

$$P_v(\alpha) := P(\{\omega \mid v_i(\omega) \le \alpha_i,\; i = 1, \cdots, n\}), \qquad \alpha = [\,\alpha_1 \cdots \alpha_n\,]' \in \mathbb{R}^n$$

As for a single random variable, if $P_v$ is absolutely continuous we have

$$P_v(\alpha) = \int_{-\infty}^{\alpha_1} \cdots \int_{-\infty}^{\alpha_n} p_v(\beta)\,d\beta, \qquad \beta = [\,\beta_1 \cdots \beta_n\,]' \in \mathbb{R}^n$$

with $p_v$ the probability density of the random vector $v$.

D.3 Conditional Probabilities

The events $A_1, \cdots, A_n$ are independent if

$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1) \times P(A_2) \times \cdots \times P(A_n)$$

The conditional probability of $A$ given $B$, $A, B \in \mathcal{F}$, is defined as

$$P(A \mid B) := \frac{P(A \cap B)}{P(B)} \qquad \text{provided that } P(B) > 0.$$

Note that $P(A \mid B) = P(A)$ if and only if $A$ and $B$ are independent. Further, $P(\cdot \mid B)$ is itself a probability measure. Thus if $v$ is a random variable defined on $(\Omega, \mathcal{F}, P)$, we can define the conditional mean of $v$ given $B$ as

$$\mathcal{E}\{v \mid B\} = \int_\Omega v(\omega)\,dP(\omega \mid B)$$



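More generally, let $\mathcal{A}$, $\mathcal{A} \subset \mathcal{F}$, denote a sub-$\sigma$-field of $\mathcal{F}$, viz. a family of elements of $\mathcal{F}$ which also forms a $\sigma$-field. The conditional expectation of $v$ w.r.t. $\mathcal{A}$, or given $\mathcal{A}$, denoted $\mathcal{E}\{v \mid \mathcal{A}\}$, is a random variable such that

i. $\mathcal{E}\{v \mid \mathcal{A}\}$ is $\mathcal{A}$-measurable

ii. $\int_A \mathcal{E}\{v \mid \mathcal{A}\}\,dP(\omega) = \int_A v(\omega)\,dP(\omega)$ for all $A \in \mathcal{A}$

If $\mathcal{A}$ is the $\sigma$-field generated by the set of random variables $\{v_1, \cdots, v_n\}$, $\mathcal{A} = \sigma\{v_1, \cdots, v_n\}$, we write

$$\mathcal{E}\{v \mid \mathcal{A}\} = \mathcal{E}\{v \mid v_1, \cdots, v_n\}$$

Properties of conditional expectation are

i. If $u = \mathcal{E}\{v \mid \mathcal{A}\}$ and $w = \mathcal{E}\{v \mid \mathcal{A}\}$, then $u = w$ a.s. (where "a.s." means almost surely, i.e. except on a set having probability measure zero)

ii. If $u$ is $\mathcal{A}$-measurable, then

$$\mathcal{E}\{uv \mid \mathcal{A}\} = u\,\mathcal{E}\{v \mid \mathcal{A}\} \quad \text{a.s.}$$

iii. $\mathcal{E}\{uv \mid \mathcal{A}\} = \mathcal{E}\{u\}\,\mathcal{E}\{v \mid \mathcal{A}\}$ if $u$ is independent of every set in $\mathcal{A}$.

iv. (Smoothing properties of conditional expectations). If $\mathcal{F}_{t-1}$, $\mathcal{F}_t$ are two sub-$\sigma$-fields of $\mathcal{F}$ such that $\mathcal{F}_{t-1} \subset \mathcal{F}_t$, then

$$\mathcal{E}\{\mathcal{E}\{v \mid \mathcal{F}_{t-1}\} \mid \mathcal{F}_t\} = \mathcal{E}\{v \mid \mathcal{F}_{t-1}\} \quad \text{a.s.}$$

$$\mathcal{E}\{\mathcal{E}\{v \mid \mathcal{F}_t\} \mid \mathcal{F}_{t-1}\} = \mathcal{E}\{v \mid \mathcal{F}_{t-1}\} \quad \text{a.s.}$$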
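On a finite sample space every $\sigma$-field is generated by a partition, and the conditional expectation is the probability-weighted average over each cell. The following sketch uses this to verify the two smoothing properties on an illustrative eight-point space with nested partitions; the data and names are chosen only for illustration.

```python
from fractions import Fraction

# Illustrative sample space: 8 equally likely outcomes, omega = 0..7.
omega = range(8)
prob = {w: Fraction(1, 8) for w in omega}
v = {w: Fraction(w * w) for w in omega}          # the random variable v(omega)

coarse = [{0, 1, 2, 3}, {4, 5, 6, 7}]            # generates F_{t-1}
fine = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]          # generates F_t, which contains F_{t-1}

def cond_exp(x, partition):
    """E{x | sigma(partition)}: constant on each cell, equal to the cell average."""
    out = {}
    for cell in partition:
        p_cell = sum(prob[w] for w in cell)
        avg = sum(prob[w] * x[w] for w in cell) / p_cell
        for w in cell:
            out[w] = avg
    return out

direct = cond_exp(v, coarse)                              # E{v | F_{t-1}}
coarse_then_fine = cond_exp(cond_exp(v, coarse), fine)    # E{E{v | F_{t-1}} | F_t}
fine_then_coarse = cond_exp(cond_exp(v, fine), coarse)    # E{E{v | F_t} | F_{t-1}}
print(direct[0], direct[7])
```

Both iterated expectations coincide with the expectation conditioned on the coarser field, as stated by the smoothing properties.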
D.4 Gaussian Random Vectors

A random vector $v$ with $n$ components is said to be Gaussian if its probability density function $p_v(\alpha)$, $\alpha \in \mathbb{R}^n$, equals

$$p_v(\alpha) = [(2\pi)^n \det V]^{-1/2} \exp\Big\{-\tfrac{1}{2}(\alpha - \bar v)'V^{-1}(\alpha - \bar v)\Big\} =: n(\bar v, V) \tag{D.4-1}$$

The function $n(\bar v, V)$ is the Gaussian or Normal probability density of $v$ with mean $\bar v$ and covariance matrix $V$, the latter assumed here to be positive definite.

Result D.4-1. Let $v$ be a Gaussian random vector with probability density $n(\bar v, V)$. Then $u(\omega) = Lv(\omega) + \ell$, with $L$ a matrix and $\ell$ a vector both of compatible dimension, is a Gaussian random vector with probability density $n(\bar u, U)$ where

$$\bar u = L\bar v + \ell \qquad \text{and} \qquad U = LVL'$$

Result D.4-2. Let $v$ be a Gaussian random vector with probability density $n(\bar v, V)$. Let $v$, $\bar v$ and $V$ be partitioned conformably as follows

$$v = \begin{bmatrix} v_1(\omega) \\ v_2(\omega) \end{bmatrix}, \qquad \bar v = \begin{bmatrix} \bar v_1 \\ \bar v_2 \end{bmatrix}, \qquad V = \begin{bmatrix} V_{11} & V_{12} \\ V_{12}' & V_{22} \end{bmatrix}$$

Then $v_i$, $i = 1, 2$, has (marginal) probability density $n(\bar v_i, V_{ii})$. Further, the conditional probability density of $v_1$ given $v_2$ is Gaussian and given by

$$p_{v_1 \mid v_2} = n\big(\bar v_1 + V_{12}V_{22}^{-1}(v_2 - \bar v_2),\; V_{11} - V_{12}V_{22}^{-1}V_{21}\big).$$
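As a small worked instance of the conditioning formula, the sketch below evaluates the conditional mean and variance when $v_1$ and $v_2$ are both scalar, so that $V_{21} = V_{12}$ and the matrix inverse is an ordinary division. The numbers are illustrative and the function name is hypothetical.

```python
from fractions import Fraction

def conditional_gaussian(m1, m2, V11, V12, V22, v2):
    """Mean and variance of v1 given v2 for scalar blocks."""
    gain = V12 / V22                                  # V12 * V22^{-1}
    return m1 + gain * (v2 - m2), V11 - gain * V12    # V21 = V12 in the scalar case

mean, var = conditional_gaussian(Fraction(1), Fraction(0),
                                 Fraction(2), Fraction(1), Fraction(4),
                                 Fraction(2))
print(mean, var)
```

Observing $v_2$ never increases the variance of $v_1$: the conditional variance is $V_{11}$ reduced by the nonnegative term $V_{12}V_{22}^{-1}V_{21}$.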



D.5 Stochastic Processes

A discrete-time stochastic process $v = \{v(t, \omega), t \in T\}$, $T \subset \mathbb{Z}$, is an integer-indexed family of random vectors defined on a common underlying probability space $(\Omega, \mathcal{F}, P)$. To indicate a stochastic process we use interchangeably the notations $\{v(t, \omega)\}$, $\{v(t)\}$ or simply $v$. For fixed $t$, $v(t, \cdot)$ is a random variable. For fixed $\omega$, $v(\cdot, \omega)$ is called a realization or a sample path of the process.

The mean, $\bar v(t)$, and the covariance, $K_v(t, \tau)$, of the process are defined as follows

$$\bar v(t) := \mathcal{E}\{v(t, \omega)\}$$

$$K_v(t, \tau) := \mathcal{E}\{[v(t, \omega) - \bar v(t)][v(\tau, \omega) - \bar v(\tau)]'\}$$

If $\bar v(t) \equiv \bar v$ and $K_v(t, \tau) = K_v(t+k, \tau+k)$, $\forall k$, $t+k, \tau+k \in T$, i.e. mean and covariance are invariant w.r.t. time shifts, we say that the process $v$ is wide-sense stationary. In such a case, with a slight abuse of notation, we write $K_v(\tau)$ in place of $K_v(t_1, t_2)$, where $\tau := t_1 - t_2$. If $K_v(\tau) = \Psi_v\delta_{\tau,0}$ we say that the process is white.

The sequence of random vectors and $\sigma$-fields $\{v(t), \mathcal{F}_t\}$, $t \in \mathbb{Z}_+$, with $\mathcal{F}_t \subset \mathcal{F}$, is called a martingale if

i. $\mathcal{F}_t \subset \mathcal{F}_{t+1}$ and $v(t)$ is $\mathcal{F}_t$-measurable (the latter condition is referred to by saying that $\{v(t)\}$ is $\{\mathcal{F}_t\}$-adapted)

ii. $\mathcal{E}\{\|v(t)\|\} < \infty$

iii. $\mathcal{E}\{v(t+1) \mid \mathcal{F}_t\} = v(t)$ a.s.

If, instead of the equality in iii., we have

$$\mathcal{E}\{v(t+1) \mid \mathcal{F}_t\} \ge v(t) \quad \text{a.s.}$$

$\{v(t), \mathcal{F}_t\}$ is said to be a submartingale. It is called a supermartingale if

$$\mathcal{E}\{v(t+1) \mid \mathcal{F}_t\} \le v(t) \quad \text{a.s.}$$

If iii. is replaced by

$$\mathcal{E}\{v(t+1) \mid \mathcal{F}_t\} = 0 \quad \text{a.s.}$$

$\{v(t), \mathcal{F}_t\}$ is called a martingale difference.

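D.6 Convergence

i. A sequence of random vectors $\{v(t, \omega), t \in \mathbb{Z}_+\}$ is said to converge almost surely (a.s.), or with probability one, to $v(\omega)$ if

$$P\Big(\Big\{\omega \;\Big|\; \lim_{t\to\infty} v(t, \omega) = v(\omega)\Big\}\Big) = 1$$

ii. $\{v(t, \omega), t \in \mathbb{Z}_+\}$ converges in probability to $v(\omega)$ if, for all $\varepsilon > 0$, we have

$$\lim_{t\to\infty} P(\{\omega \mid \|v(t, \omega) - v(\omega)\| > \varepsilon\}) = 0$$

iii. $\{v(t, \omega), t \in \mathbb{Z}_+\}$ converges in $\nu$-th mean ($\nu > 0$) to $v(\omega)$ if

$$\lim_{t\to\infty} \mathcal{E}\{\|v(t, \omega) - v(\omega)\|^\nu\} = 0$$

If $\nu = 2$ we say that convergence is in quadratic mean, or mean-square.

iv. $\{v(t, \omega), t \in \mathbb{Z}_+\}$ converges in distribution to $v(\omega)$ if

$$\lim_{t\to\infty} P_{v(t)}(\alpha) = P_v(\alpha)$$

at all the continuity points of $P_v(\cdot)$.

We point out that the well-known Markov inequality

$$P(\{\omega \mid \|v(\omega)\| \ge \varepsilon\}) \le \frac{\mathcal{E}\{\|v\|^\nu\}}{\varepsilon^\nu}$$

for any $\varepsilon, \nu > 0$, shows that the convergence in $\nu$-th mean implies convergence in probability. The following connections exist between the above types of convergence

$$\begin{array}{ccccc} v(t) \to v \;\text{a.s.} & \Longrightarrow & v(t) \to v \;\text{in probability} & \Longrightarrow & v(t) \to v \;\text{in distribution} \\ & & \Uparrow & & \\ & & v(t) \to v \;\text{in } \nu\text{-th mean} & & \end{array}$$

Some convergence results are listed hereafter.

Lemma D.6-1 (Martingale stability). Let $\{v(t), \mathcal{F}_t\}$ be a martingale difference sequence. Then

$$\sum_{t=1}^{\infty} \frac{1}{t^p}\,\mathcal{E}\{\|v(t)\|^p \mid \mathcal{F}_{t-1}\} < \infty \quad \text{a.s.}$$

for some $0 < p \le 2$, implies that

$$\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} v(t) = 0 \quad \text{a.s.}$$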

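Proposition D.6-1. Let $\{v(t), \mathcal{F}_t\}$ be a martingale difference sequence. Then

$$\mathcal{E}\{v^2(t) \mid \mathcal{F}_{t-1}\} = \sigma^2 \quad \text{a.s.}$$

and

$$\mathcal{E}\{v^4(t) \mid \mathcal{F}_{t-1}\} \le M < \infty \quad \text{a.s.}$$

imply that

$$\lim_{N\to\infty} \frac{1}{N}\sum_{t=1}^{N} v^2(t) = \sigma^2 \quad \text{a.s.}$$

Proof. The result follows from Lemma D.6-1. Set $u(t) := v^2(t) - \mathcal{E}\{v^2(t) \mid \mathcal{F}_{t-1}\} = v^2(t) - \sigma^2$. Then $\mathcal{E}\{u(t) \mid \mathcal{F}_{t-1}\} = 0$. Hence, $\{u(t), \mathcal{F}_t\}$ is a martingale difference sequence. Also

$$\sum_{t=1}^{\infty} \frac{1}{t^2}\,\mathcal{E}\{u^2(t) \mid \mathcal{F}_{t-1}\} = \sum_{t=1}^{\infty} \frac{1}{t^2}\big[\mathcal{E}\{v^4(t) \mid \mathcal{F}_{t-1}\} - \sigma^4\big] \le \sum_{t=1}^{\infty} \frac{1}{t^2}(M - \sigma^4) < \infty$$

so that Lemma D.6-1 applies to $\{u(t)\}$ with $p = 2$.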

The following positive supermartingale convergence result is important for convergence analysis of stochastic recursive algorithms (Cf. Theorem 6.4-3).

Theorem D.6-1 (The Martingale Convergence Theorem). Let $\{v(t), \alpha(t), \beta(t), t \in \mathbb{Z}_+\}$ be three sequences of positive random variables adapted to an increasing sequence of $\sigma$-fields $\{\mathcal{F}_t, t \in \mathbb{Z}_+\}$ and such that

$$\mathcal{E}\{v(t) \mid \mathcal{F}_{t-1}\} \le v(t-1) - \alpha(t-1) + \beta(t-1) \quad \text{a.s.}$$

with

$$\sum_{t=0}^{\infty} \beta(t) < \infty \quad \text{a.s.}$$

Then $\{v(t), t \in \mathbb{Z}_+\}$ converges a.s. to a finite random variable

$$\lim_{t\to\infty} v(t) = v < \infty \quad \text{a.s.}$$

and

$$\sum_{t=0}^{\infty} \alpha(t) < \infty \quad \text{a.s.}$$
The following property is used in Chapter 6 to establish convergence results.

Result D.6-1 (Kronecker Lemma). Let $\{a(t)\}$ and $\{b(t)\}$ be two real-valued sequences such that

$$\lim_{k\to\infty} \sum_{t=1}^{k} a(t) < \infty$$

$\{b(t)\}$ is nondecreasing and $\lim_{t\to\infty} b(t) = \infty$. Then

$$\lim_{k\to\infty} \frac{1}{b(k)}\sum_{t=1}^{k} b(t)a(t) = 0$$

D.7 Minimum Mean-Square-Error Estimators

Consider the square-integrable random variables $w$ and $y_i$, $i = 1, \cdots, n$, on a common probability space $(\Omega, \mathcal{F}, P)$. Let $\mathcal{A} = \sigma(y)$ be the sub-$\sigma$-field of $\mathcal{F}$ generated by the components of $y := [\,y_1 \cdots y_n\,]'$. Then $L_2(\mathcal{A}) := L_2(\Omega, \mathcal{A}, P)$ is a closed subspace of $L_2(\mathcal{F}) = L_2(\Omega, \mathcal{F}, P)$. Its elements can be conceived as all square-integrable random variables given by any nonlinear transformation of $y$. We show that the conditional mean $\mathcal{E}\{w \mid y\} = \mathcal{E}\{w \mid \mathcal{A}\}$ enjoys the following property

$$\mathcal{E}\{w \mid y\} = \arg\min_{v \in L_2(\mathcal{A})} \mathcal{E}\{(w - v)^2\} \tag{D.7-1}$$

In fact, setting $\hat w = \mathcal{E}\{w \mid y\}$,

$$\begin{aligned} \mathcal{E}\{(w - v)^2\} &= \mathcal{E}\{[(w - \hat w) + (\hat w - v)]^2\} \\ &= \mathcal{E}\{(w - \hat w)^2 + (\hat w - v)^2\} + 2\mathcal{E}\{(w - \hat w)(\hat w - v)\} \\ &= \mathcal{E}\{(w - \hat w)^2\} + \mathcal{E}\{(\hat w - v)^2\} \\ &\ge \mathcal{E}\{(w - \hat w)^2\} \end{aligned}$$

The third equality above follows since by the smoothing properties of conditional expectations

$$\mathcal{E}\{(\hat w - v)(w - \hat w)\} = \mathcal{E}\{\mathcal{E}\{(\hat w - v)(w - \hat w) \mid y\}\} = \mathcal{E}\{(\hat w - v)\,\mathcal{E}\{w - \hat w \mid y\}\} = 0$$

The RHS of (D.7-1) is called the minimum mean-square error (MMSE) or minimum variance estimator of $w$ based on $y$. Hence (D.7-1) shows that the MMSE estimator of $w$ given $y$ is given by the conditional mean $\mathcal{E}\{w \mid y\}$. The latter can be interpreted as the orthogonal projection of $w \in L_2(\mathcal{F})$ onto $L_2(\sigma(y))$.
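The optimality of the conditional mean can be observed on a discrete example. In the sketch below, the joint distribution of $(y, w)$ is an arbitrary illustrative choice on a four-point sample space; the mean-square error of the conditional mean is compared with that of a grid of competing functions of $y$, which is a numerical spot check rather than a proof.

```python
from fractions import Fraction
from itertools import product

# Illustrative joint probabilities P(y, w) on a finite sample space.
joint = {(0, 1): Fraction(1, 4), (0, 3): Fraction(1, 4),
         (1, 2): Fraction(1, 8), (1, 6): Fraction(3, 8)}

def cond_mean(y):
    """E{w | y} for the discrete joint distribution above."""
    mass = sum(p for (yy, _), p in joint.items() if yy == y)
    return sum(p * w for (yy, w), p in joint.items() if yy == y) / mass

def mse(estimator):
    """E{(w - v)^2} for v = estimator(y)."""
    return sum(p * (w - estimator(y)) ** 2 for (y, w), p in joint.items())

best = mse(cond_mean)
# Competitors: every function of y with values on a half-integer grid.
grid = [Fraction(k, 2) for k in range(15)]
others = [mse(lambda y, a=a, b=b: a if y == 0 else b)
          for a, b in product(grid, grid)]
print(best, min(others))
```

Since $y$ takes only two values here, any estimator based on $y$ is specified by two numbers, and none of the grid candidates achieves a smaller error than the conditional mean.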